Generation of diversity in combinatorial libraries

ABSTRACT

The invention concerns gene banks and combinatorial derivatives thereof, prepared using phagemid- or phage-display in combination with type IIS restriction enzymes and cosmid packaging; their use for the isolation of ligands, including enzyme inhibitors, agonists and antagonists for receptors, competitive binding peptides to a defined target, diagnostic ligands for diseases and autoimmune syndromes, including surveillance tools for immune status, post-translationally modified peptides, and such ligands generated by this technology.

The present application is a divisional application of U.S. application Ser. No. 09/364,707, which was filed Jul. 30, 1999 and issued as U.S. Pat. No. 6,310,191, which in turn was a continuation of PCT/EP98/00533, filed Feb. 2, 1998, which claimed priority to EP 97 101 539, filed Jan. 31, 1997.

Biotech evolutionary methods, including combinatorial libraries and phage-display technology (PARMLEY & SMITH 1988; SCOTT & SMITH 1990; SMITH 1993), are used in the search for novel ligands of diagnostic, biomedical and pharmaceutical use (reviews: CORTESE 1996; COLLINS 1997). These methods, which use empirical procedures to select molecules with required characteristics, e.g. binding properties, from large populations of variant gene products, have been compared to the process of natural evolution. Evolution includes the generation of mutation, selection of functionality over a time period and the ability of the systems to self-replicate. In particular, natural systems use recombination to reassort mutations accumulated in the selected population to exponentially increase the combinations of mutations and thus increase the number of variants in the population. This latter aspect, namely the introduction of recombination within mutant genes, has only recently been applied to biotech evolutionary methods, although it has been used to increase the size of initial phage-display libraries (e.g. WATERHOUSE 1993; TSURUSHITA 1996; SODOYER 1994; FISCH 1996). STEMMER 1994a, 1994b and 1995 teach that recombination amongst a population of DNA molecules can be achieved in vitro by PCR amplification of a mixture of small overlapping fragments with (1994a, 1994b) or without (STEMMER 1995) primer oligonucleotide sequences being used to drive the PCR reaction. The method is not applicable to recombination within a fully randomized (highly mutated) sequence, since it relies on high homology of the overlapping sequences at the site of recombination. STEMMER 1994b and CRAMERI 1996a do, however, demonstrate the usefulness of in vitro recombination for molecular evolution, and CRAMERI 1996b also demonstrate the use of the method in conjunction with phage-display, even though their method is confined to regions of low mutant density (ca. 0.5-1% of the bases are mutated in their method). As they state, “the advantages of recombination over existing mutagenesis methods are likely to increase with the numbers of cycles of molecular evolution” (STEMMER 1994b). We point out that this is due to the self-evident fact that the number of variants created by mutagenesis introducing base changes in existing mutant structures is an additive, i.e. linearly increasing, function, whereas the use of recombination between mutated variants yields novel variants as an exponential function of the initial number of variants. The classical phage-display libraries are thus at a grave disadvantage for the generation of novel variants; e.g. to encompass all the possible variants of an octapeptide sequence, 20⁸=2.56×10¹⁰ different variants would be required.

MARKS 1992 state the importance of recombination in the generation of higher specificity in combinatorial libraries, e.g. in attaining antibodies of higher specificity and binding constants by reshuffling light and heavy chains of immunoglobulins displayed in phage-display libraries. These authors do not instruct how the shuffling of all the light and heavy chains in a population heterogeneous in both chains can be achieved, e.g. by a vector allowing recombination. Heavy and light chains were selected one after the other, i.e. an optimal heavy chain was first selected from a heterogeneous heavy chain population in the presence of a constant light chain; then, by preparing a new library, an optimal light chain was selected in combination with the preselected optimal heavy chain. The extensive, time-consuming sequential optimization strategies currently utilized, including consensus-mutational libraries, in vivo mutagenesis, error-prone PCR and chain shuffling, are summarized in FIGS. 5 and 6 of COLLINS 1997.

General background to phage and phage-display libraries

Gene libraries are generated containing extremely large numbers (10⁶ to 10¹⁰) of variants. The variant gene segments are fused to a coat protein gene of a filamentous bacteriophage (e.g. M13, fd or f1), and the fusion gene is inserted into the genome of the phage or of a phagemid. A phagemid is defined as a plasmid containing the packaging and replication origin of the filamentous bacteriophage. This latter property allows the packaging of the phagemid genome into a phage coat when it is present in an Escherichia coli host strain infected with a filamentous phage (superinfection). The packaged particles produced, be they phage or phagemid, display the fusion protein on the surface of the particles secreted into the medium. Such packaged particles are able to inject their genomes into a new host bacterium, where they can be propagated as phage or plasmids, respectively. The special property of the system lies in the fact that, since the packaging takes place in individual cells usually infected by a single variant phage/phagemid, the particles produced on propagation contain the gene encoding the particular variant displayed on the particle's surface. Several cycles of affinity selection for clones exhibiting the required properties due to the particular property of the variant protein displayed, e.g. binding to a particular target molecule immobilized on a surface, followed by amplification of the enriched clones, lead to the isolation of a small number of different clones having these properties. The primary structure of these variants can then be rapidly elucidated by sequencing the hypermutated segment of the variant gene.

Efficiency of producing combinatorial libraries

There are a number of factors which limit the potential of this technology. The first is the number and diversity of the variants which can be generated in the primary library. Most libraries have been generated by transformation of ligated DNA preparations into Escherichia coli by electroporation. This gives an efficiency of ca. 0.1 to 1×10⁶ recombinants per microgram of ligated phage DNA. The highest cloning efficiency reported (10⁷ recombinants per microgram insert DNA) is obtained using special lambda vectors into which a single filamentous phage vector is inserted, in a special cloning site, bracketed by a duplication of the filamentous phage replication/packaging origin (AMBERG 1993; HOGREFE 1993a+b). The DNA construct is efficiently introduced into the Escherichia coli host after packaging into a lambda bacteriophage coat in an in vitro lambda packaging mix. Infection of a strain carrying such a hybrid phagemid by an M13 helper phage allows excision and secretion of the insert packed in a filamentous phage coat. Neither AMBERG 1993 nor HOGREFE 1993a+b instruct on how the method may be used to introduce recombination during this procedure. Although they mention that the efficiency may be improved by the use of type IIS restriction endonucleases during the construction of the concatemers used as substrate for the in vitro packaging, no examples are given, and in the ensuing five years no examples have appeared in the literature. The procedure described in our invention also uses the high efficiency of the in vitro lambda packaging, but maximizes the capacity of the cloning vector by using a cosmid vector (8) in which many copies (say 8) of the phagemid are inserted in each construct. One of the surprising innovative aspects of this procedure is the discovery of a number of protocols for the de novo synthesis of large hypervariable libraries. One type is particularly efficient, in that phagemid/cosmid vectors are forced to integrate into the hybrid concatemers in the same orientation. Any variant of the protocol which does not ensure this feature does not work efficiently.

The use of type IIS restriction endonucleases

SZYBALSKI 1991 teaches a large number of novel applications for type IIS restriction endonucleases, including precise trimming of DNA, retrieval of cloned DNA, gene assembly, use as a universal restriction enzyme, cleavage of single-stranded DNA, detection of point mutations, tandem amplification, printing amplification reactions and localization of methylated bases. They do not give any instruction as to how such enzymes can be used in the creation of recombination within highly mutated regions, e.g. within a combinatorial library.

Reference list

Amberg, J., Hogrefe, H., Lovejoy, H., Hay, B., Shopes, B., Mullinax, R. and Sorge, J. A. (1993) Strategies 5, 2-3.

Collins, J. (1997) Phage display. In Moos, W. H. et al. (eds.) Annual reports in combinatorial chemistry and molecular diversity. Vol. 1, ESCOM Science Publ., Leiden, pp. 210-262.

Cortese, R. (ed.) (1996) Combinatorial libraries: Synthesis, Screening and Application potential. Walter de Gruyter, Berlin.

Crameri, A., Whitehorn, E. A., Tate, E. and Stemmer, W. P. C. (1996a) 14, 315-319.

Crameri, A., Cwirla, S. and Stemmer, W. P. C. (1996b) Nat. Med. 2, 100.

Fisch, I., Kontermann, R. E., Finnern, R., Hartley, O., Soler-Gonzalez, A. S., Griffiths, A. D. and Winter, G. (1996) Proc. Natl. Acad. Sci. USA 93, 7761.

Marks, J. D., Griffiths, A. D., Malmqvist, M., Clackson, T. P., Bye, J. M. and Winter, G. (1992) BioTechnol. 10, 779-783.

Hogrefe, H. H., Amberg, J. R., Hay, B. N., Sorge, J. A. and Shopes, B. (1993) Gene 137, 85-91.

Hogrefe, H. H., Mullinax, R. L., Lovejoy, A. E., Hay, B. N. and Sorge, J. A. (1993) Gene 128, 119-126.

Parmley, S. F. and Smith, G. P. (1988) Gene 73, 305-318.

Scott, J. K. and Smith, G. P. (1990) Science 249, 386-390.

Smith, G. P. (1993) Gene 128, 1-2.

Sodoyer, R., Aujume, L., Geoffrey, F., Pion, C., Puebez, I., Montegue, B., Jacquemot, P. and Dubayle, J. (1996) In Kay, B. K. et al. (eds.) Phage display of peptides and proteins. A laboratory manual. Academic Press, San Diego, pp. 215-226.

Stemmer, W. P. C. (1994a) Nature (Lond.) 370, 389-391.

Stemmer, W. P. C. (1994b) Proc. Natl. Acad. Sci. USA 91, 10747-10751.

Stemmer, W. P. C. (1995) Gene 164, 49-53.

Szybalski, W., Kim, S. C., Hasan, N. and Podhajska, A. J. (1991) Gene 100, 13-26.

Tsurushita, M., Fu, H. and Warren, C. (1996) Gene 172, 59.

Waterhouse, P., Griffiths, A. D., Johnson, K. S. and Winter, G. (1993a) Nucleic Acids Res. 2265-2269.

According to a first embodiment the invention concerns a bank of genes, wherein said genes comprise a double stranded DNA sequence which is represented by the following formula of one of their strands:

5′B₁B₂B₃ . . . B_(n)X_(n+1) . . . X_(n+a)Z_(n+a+1)Z_(n+a+2)X_(n+a+3) . . . X_(n+a+b)Q_(n+a+b+1) . . . Q_(n+a+b+j)3′

wherein n, a, b and j are integers and n>3, a>1, b>3 and j>1,

wherein X_(n+1) . . . X_(n+a+b) is a hypervariable sequence and B, X, Z and Q represent adenine (A), cytosine (C), guanine (G) or thymine (T),

(i) Z represents G or T at a G:T ratio of about 1:1, and/or

(ii) Z represents C or T at a C:T ratio of about 1:1, and/or

(iii) Z represents A or G at a A:G ratio of about 1:1, and/or

(iv) Z represents A or C at a A:C ratio of about 1:1, and wherein

subsequences B₁ . . . B_(n) and/or Q_(n+a+b+1) . . . Q_(n+a+b+j) represent recognition sites for restriction enzymes, and wherein the recognition sites are oriented such that their cleavage site upon cleavage generates a cohesive end including the two bases designated Z.

Restriction of this sequence with a type IIS restriction enzyme as thus described, followed by religation, leads to the recombination of the hypervariable regions located 5′ and 3′ of the cleavage site. This is the essence of the methodology which we designate “cosmix-plexing”. It is essential in this procedure that the fragments generated on cleavage by the restriction enzyme are religated in the correct orientation (“head-to-tail”), whereby the Z sequences are chosen for the four libraries ((i) to (iv)) so as to ensure this (see below) while still allowing all possible amino-acids to be encoded at the cleavage site. If this correct orientation is not ensured, there will be a drastic reduction in the percentage of correctly reconstituted fusion-protein genes, a reduction in the proportion of molecules which can be packaged in vitro in the lambda-packaging extracts (which requires the correct orientation of the cos-sites), as well as a reduction in the proportion of in vivo excisable phagemid copies from the cosmid concatemer (excision requires the correct orientation of consecutive phage replication origins).

   correct orientation   correct orientation    incorrect orientation (head-to-head ligation)
5′ ------>XGG/x----->    -------->XCC/x---->    ------->XGG/Y---->
3′ --------Y/CCy<----    --------Y/GGy<-----    ------Y/CCX<------

To prevent the problems arising from false orientation (head-to-head) mentioned in the previous paragraph, the four gene libraries (i) to (iv) must be kept separate during cosmix-plexing. In fact, with respect to the formation of recombinants the libraries behave as 16 separate sets which cannot recombine with each other: four libraries maintained separately, where each library contains four possible cohesive ends, e.g. library (i) with Z=G or T contains:

5′ ---->XGT/Y---->   ---->XGG/Y---->   ---->XTG/Y---->   ---->XTT/Y---->
3′ ------y/CAx<---   ------y/CCx<---   ------y/ACx<---   ------y/AAx<---

It is evident that problems of false orientation will arise on mixing the different libraries, e.g. the AC library (iv) will contain AA, AC, CA and CC sequences which can pair in the false orientation with, respectively, each of the cohesive ends generated in library (i).
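The orientation argument can be checked with a short sketch. It assumes, as an illustrative model that is not spelled out in this form in the text, that two cohesive ends can ligate head-to-head exactly when one 2-base overhang ZZ is the reverse complement of another overhang present in the same mixture. The function names are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def overhangs(bases):
    """All 2-base overhangs ZZ possible when each Z is drawn from `bases`."""
    return {"".join(p) for p in product(bases, repeat=2)}

def false_pairs(ends):
    """Overhang pairs in `ends` that could anneal head-to-head (model assumption)."""
    return sorted((e, reverse_complement(e)) for e in ends
                  if reverse_complement(e) in ends)

# Z alphabets of the four libraries (i) to (iv) as defined in the text.
LIBRARIES = {"i": "GT", "ii": "CT", "iii": "AG", "iv": "AC"}

for name, bases in LIBRARIES.items():
    ends = overhangs(bases)
    print(name, sorted(ends), "false pairs:", false_pairs(ends))

# Mixing library (i) with library (iv) makes every overhang a false-pairing partner.
mixed = overhangs(LIBRARIES["i"]) | overhangs(LIBRARIES["iv"])
print("mixed i+iv:", false_pairs(mixed))
```

Under this model each library on its own has no false-pairing overhangs, while the mixture of (i) and (iv) pairs AA with TT, AC with GT, CA with TG and CC with GG.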

A specific embodiment of the invention concerns a bank of genes wherein subsequences B₁ . . . B_(n) or Q_(n+a+b+1) . . . Q_(n+a+b+j) represent recognition sites for restriction enzymes and wherein the recognition sites are orientated such that their cleavage site upon cleavage generates a cohesive end including the two bases designated Z.

Further, a specific embodiment concerns a bank of genes, wherein the cohesive end is a 2 bp single strand end formed by the two bases designated Z.

Further, a specific embodiment concerns a bank of genes wherein each gene is provided as display vector, especially as M13 phage or M13-like phage or as phagemid.

Another embodiment of the invention concerns a set of four gene banks according to the invention wherein the gene banks are characterized as follows:

first gene bank: Z represents G or T, preferentially at a G:T ratio of about 1:1;

second gene bank: Z represents C or T, preferentially at a C:T ratio of about 1:1;

third gene bank: Z represents A or G, preferentially at a A:G ratio of about 1:1; and

fourth gene bank: Z represents A or C, preferentially at a A:C ratio of about 1:1.

A specific embodiment of the invention concerns a set of four gene banks wherein each gene is provided as display vector, especially as M13 phage or M13-like phage or as phagemid.

Another embodiment of the invention concerns a bank of genes wherein said genes comprise a double stranded DNA sequence which is represented by the following formula of one of their strands:

5′B₁B₂B₃ . . . B_(n)X_(n+1) . . . X_(n+a)Z_(n+a+1)Z_(n+a+2)X_(n+a+3) . . . X_(n+a+b)Q_(n+a+b+1) . . . Q_(n+a+b+j)3′

wherein n, a, b and j are integers and n>3, a>1, b>3 and j>1,

wherein X_(n+1) . . . X_(n+a+b) is a hypervariable sequence and B, X, Z and Q represent adenine (A), cytosine (C), guanine (G) or thymine (T), and wherein

four sets of oligonucleotide sequences comprising Z_(n+a+1) and Z_(n+a+2) are present, preferentially at a ratio of (i):(ii):(iii):(iv) of about 1:1:2:2, wherein the four sets are characterized as follows:

first set: Z_(n+a+1) represents G and Z_(n+a+2) also represents G;

second set: Z_(n+a+1) represents C and Z_(n+a+2) represents T;

third set: Z_(n+a+1) represents A and Z_(n+a+2) represents A or C, preferentially at A:C ratio of about 1:1; and

fourth set: Z_(n+a+1) represents T and Z_(n+a+2) represents C or G, preferentially at a C:G ratio of about 1:1, and wherein sequences B₁ . . . B_(n) and/or Q_(n+a+b+1) . . . Q_(n+a+b+j) represent recognition sites for restriction enzymes, wherein the recognition sites are orientated such that their cleavage site upon cleavage generates a cohesive end including the two bases designated Z.

A specific embodiment of the invention concerns a bank of genes wherein the four sets of oligonucleotide sequences are present at a ratio of (i):(ii):(iii):(iv) of (0 to 1):(0 to 1):(0 to 1):(0 to 1) with the proviso that at least one of said sets is present.

Further, a specific embodiment of the invention concerns a bank of genes wherein subsequences B₁ . . . B_(n) and/or Q_(n+a+b+1) . . . Q_(n+a+b+j) represent recognition sites for restriction enzymes and wherein the recognition sites are orientated such that their cleavage site upon cleavage generates a cohesive end including the two bases designated Z.

Further, a specific embodiment of the invention concerns a bank of genes wherein the cohesive end is a 2 bp single strand end formed by the two bases designated Z.

Another embodiment of the invention concerns a bank of genes wherein said genes comprise a double stranded DNA sequence which is represented by the following formula of one of their strands:

5′B₁B₂B₃ . . . B_(n)X_(n+1) . . . X_(n+a)Z_(n+a+1)Z_(n+a+2)X_(n+a+3) . . . X_(n+a+b)Q_(n+a+b+1) . . . Q_(n+a+b+j)3′

wherein n, a, b and j are integers and n>3, a>1, b>3 and j>1,

wherein X_(n+1) . . . X_(n+a+b) is a hypervariable sequence and B, X, Z and Q represent adenine (A), cytosine (C), guanine (G) or thymine (T), and wherein the following six sets of oligonucleotide sequences comprising X_(n+a), Z_(n+a+1) and Z_(n+a+2) are present, preferably at a ratio of (i):(ii):(iii):(iv):(v):(vi) of about 3:4:3:4:4:1, wherein the six sets are characterized as follows:

first set: X_(n+a) represents A, G and/or T, preferentially at a ratio of about 1:1:1, or X_(n+a) represents C, G and/or T, preferentially at a ratio of about 1:1:1, Z_(n+a+1) represents G and Z_(n+a+2) represents G;

second set: X_(n+a) represents A, C, G and/or T, preferentially at a ratio of about 1:1:1:1, Z_(n+a+1) represents C and Z_(n+a+2) represents T;

third set: X_(n+a) represents A, C and/or G, preferentially at a ratio of about 1:1:1, Z_(n+a+1) represents A and Z_(n+a+2) represents A;

fourth set: X_(n+a) represents A, C, G and/or T, preferentially at a ratio of about 1:1:1:1, Z_(n+a+1) represents A and Z_(n+a+2) represents C;

fifth set: X_(n+a) represents A, C, G and/or T, preferentially at a ratio of about 1:1:1:1, Z_(n+a+1) represents T and Z_(n+a+2) represents C;

sixth set: X_(n+a) represents A, Z_(n+a+1) represents T and Z_(n+a+2) represents G.

“Single-tube” method

Problem

A method should be developed which allows cosmix-plexing without maintaining separate libraries. This would have the advantage of reducing the manipulation involved in screening the four separate libraries, as previously described, and would offer a saving in both time and materials. This has been achieved in two separate versions of the invention.

Solution

It is possible to select combinations of nucleotides within the cohesive ends generated by type IIS restriction within the aforementioned sequence, i.e. ZZ, in which all the clones are present in a single library and in which the possibility of false orientation during ligation, and the associated loss of efficiency, is eliminated. At the same time the number of subsets, defined by the number of different cohesive ends which can be generated and which cannot interact (recombine) with each other, is reduced from the 16 sets of the previously described version of the method to 6.

Designing the sequences

The combinations of 2 bp single-strand cohesive end sequences which can be generated at ZZ are theoretically as follows:

AA  CA  GA  TA
AC  CC  GC  TC
AG  CG  GG  TG
AT  CT  GT  TT

Of these, the sequences with an inverted symmetry axis (palindromes: AT, TA, GC, CG) can pair in both orientations and are thus to be eliminated from cosmix-plexing libraries for the reasons given above. The remaining 12 sequences are actually 6 sets of complementary pairs (e.g. CC+GG, AA+TT, CA+TG). By choosing one partner from each pair (total of 6), a single set of cohesive ends can be generated which can pair only in the correct “head-to-tail” orientation. The actual choice of sequences takes the codon usage into account, assuming that ZZ are chosen as the 2nd and 3rd position of the codon. The determining factors are the amino-acids which are encoded by either a single or only two codons: methionine (TG) and tryptophan (GG) have a single codon, and after elimination of the palindromic sequences there are also only single codons available encoding aspartic acid (Asp), asparagine (Asn), cysteine (Cys), histidine (His) and tyrosine (Tyr). To encode Asp, Asn, His and Tyr an AC sequence is required. Selecting AC has the consequence that the complementary sequence GT must be avoided, and GT is the only possibility of encoding Cys. However, the inclusion of Cys within the hypervariable sequence often causes problems of misfolding and the formation of dimeric aggregates, dependent on the redox potential of the environment. It was thus decided to create a set in which Cys codons are eliminated, but which will be of great use in many applications, including cyclic peptide library formation. If the sequence AA is chosen to encode glutamic acid (Glu), glutamine (Gln) and lysine (Lys), also allowing the stop-codon TAA, then TT must be eliminated. The consequence of this is that TC must be included so that phenylalanine (Phe) and isoleucine (Ile) can be encoded. The elimination of the complementary GA is without consequence, since other (GG) codons encode arginine (Arg) and glycine (Gly). The elimination of CC is then without consequence, since alanine (Ala), proline (Pro), serine (Ser) and threonine (Thr) can be encoded by CT-containing codons. This is the argumentation for the selection of ZZ sequences designated “combination A” below.

For the sake of completeness: if the doublet AA were left out and, consequently, TT included, then AG must be included to encode Glu, Gln and Lys. In order to encode Ala and Pro, either CT (combination B) or CA (combination C) must now be included. This leads to the inclusion of either AG and CT (combi. B), or CA and TG (combi. C) as complementary pairs. Combinations B and C thus do not represent an adequate solution to the problem.

combination A    combination B    combination C
AA  TT           AA  TT           AA  TT
AC  GT           AC  GT           AC  GT
AG  CT           AG  CT           AG  CT
CA  TG           CA  TG           CA  TG
CC  GG           CC  GG           CC  GG
GA  TC           GA  TC           GA  TC
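The design rule (drop the palindromes, then take exactly one partner from each complementary pair) can be sketched as follows. "Complementary" is read here as reverse-complementary, which matches the pairs listed in the text. The doublets of combination A are taken from the four oligonucleotide sets NGG, NCT, NA(A or C) and NT(C or G) given in the text; the function and variable names are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

# All 16 possible 2-base cohesive ends.
ALL_ENDS = ["".join(p) for p in product("ACGT", repeat=2)]

# Ends identical to their own reverse complement can pair in both orientations.
PALINDROMES = [e for e in ALL_ENDS if reverse_complement(e) == e]

# The remaining 12 ends fall into 6 complementary pairs.
PAIRS = sorted({tuple(sorted((e, reverse_complement(e))))
                for e in ALL_ENDS if e not in PALINDROMES})

# Doublets of combination A, from the sets NGG, NCT, NA(A/C), NT(C/G).
COMBINATION_A = {"GG", "CT", "AA", "AC", "TC", "TG"}

def is_valid_selection(selection):
    """True if exactly one partner of every complementary pair is selected."""
    return (len(selection) == len(PAIRS)
            and all(len(selection & set(pair)) == 1 for pair in PAIRS))

print("palindromes:", PALINDROMES)
print("pairs:", PAIRS)
print("combination A valid:", is_valid_selection(COMBINATION_A))
```

Any selection that contains both members of a pair, such as AA together with TT, fails the check.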

Sequences chosen are shown in bold type. Complementary pairs are adjacent to each other.

TABLE 1 Genetic code; the selection of XZZ codons used according to combination A is marked with an asterisk.

Ala   GCA  GCC  GCG  GCT*
Arg   AGA  AGG* CGA  CGC  CGG* CGT
Asp   GAC* GAT
Asn   AAC* AAT
Cys   TGC  TGT
Glu   GAA* GAG
Gln   CAA* CAG
Gly   GGA  GGC  GGG* GGT
His   CAC* CAT
Ile   ATA  ATC* ATT
Leu   TTA  TTG* CTA  CTC* CTG* CTT
Lys   AAA* AAG
Met   ATG*
Phe   TTC* TTT
Pro   CCA  CCC  CCG  CCT*
Ser   AGC  AGT  TCA  TCC  TCG  TCT*
Thr   ACA  ACC  ACG  ACT*
Trp   TGG*
Tyr   TAC* TAT
Val   GTA  GTC* GTG* GTT
Stop  TAA* TAG  TGA

TABLE 2 Frequency of the amino-acids, comparing the selected combination A (above) and the natural frequency of all codons.

Amino-acid   natural frequency   Combination A
Ala          4                   1
Arg          6                   2
Asp          2                   1
Asn          2                   1
Cys          2                   0
Glu          2                   1
Gln          2                   1
Gly          4                   1
His          2                   1
Ile          3                   1
Leu          6                   3
Lys          2                   1
Met          1                   1
Phe          2                   1
Pro          4                   1
Ser          6                   1
Thr          4                   1
Trp          1                   1
Tyr          2                   1
Val          4                   2
Stop         3                   1
Total (21)   64                  24
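The counts in Table 2 can be reproduced from the standard genetic code with a short sketch. It treats ZZ as the 2nd and 3rd codon positions, as stated in the text, and uses one-letter amino-acid codes with "*" for Stop; the variable names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Doublets (2nd and 3rd codon positions) selected in combination A.
COMBINATION_A = {"GG", "CT", "AA", "AC", "TC", "TG"}

# Number of codons per amino-acid: all 64 codons versus combination A only.
natural = Counter(CODON_TABLE.values())
selected = Counter(aa for codon, aa in CODON_TABLE.items()
                   if codon[1:] in COMBINATION_A)

for aa in sorted(natural):
    print(aa, natural[aa], selected[aa])
print("total", sum(natural.values()), sum(selected.values()))
```

The totals come out at 64 and 24, with Cys the only amino-acid that has no codon in combination A.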

Creation of a set of four oligonucleotides according to combination A

Gene libraries can be created according to the requirements of combination A by creating four sets of oligonucleotides in which X_(n+a)Z_(n+a+1)Z_(n+a+2) are:

i) NGG

ii) NCT

iii) NA (A or C)

iv) NT (C or G),

where N is C, G, A or T.

After the synthesis of these oligonucleotides they can be combined to obtain a single-tube cosmix-plexing gene library, whereby to obtain the relative codon frequencies given in Table 2 the gene libraries i) to iv) are present in the final mixture at a ratio of 1:1:2:2, respectively. As explained above, this mixture will always give a correct orientation on religation of type IIS restriction enzyme-cleaved fragments having the 2 bp single-stranded cohesive ends ZZ.
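The effect of the 1:1:2:2 mixing ratio can be illustrated with a small calculation. The sketch assumes that each degenerate position within an oligonucleotide set is an equimolar mixture of its allowed bases; that assumption, and the helper name `codon_frequencies`, are not taken from the text.

```python
from fractions import Fraction
from itertools import product

# Allowed bases at X_(n+a), Z_(n+a+1), Z_(n+a+2) for the four sets i) to iv).
SETS = {
    "i": ("ACGT", "G", "G"),     # NGG
    "ii": ("ACGT", "C", "T"),    # NCT
    "iii": ("ACGT", "A", "AC"),  # NA (A or C)
    "iv": ("ACGT", "T", "CG"),   # NT (C or G)
}
RATIO = {"i": 1, "ii": 1, "iii": 2, "iv": 2}

def codon_frequencies(sets, ratio):
    """Fraction of the final mixture contributed by each XZZ codon."""
    total = sum(ratio.values())
    freq = {}
    for name, positions in sets.items():
        codons = ["".join(c) for c in product(*positions)]
        # Equimolar degenerate positions: each codon gets an equal share of its set.
        share = Fraction(ratio[name], total) / len(codons)
        for codon in codons:
            freq[codon] = freq.get(codon, 0) + share
    return freq

freq = codon_frequencies(SETS, RATIO)
print(len(freq), "codons, frequencies:", set(freq.values()))
```

Under these assumptions the two 8-codon sets need twice the weight of the two 4-codon sets so that all 24 codons are equally represented, which is what the 1:1:2:2 ratio achieves.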

Alternatively: a set of six oligonucleotides conforming to combination A

Gene libraries can be created according to a modification of combination A, in which both Stop and cysteine codons are eliminated and in which each of the other amino-acids is represented by a single codon, by creating six sets of oligonucleotides in which X_(n+a)Z_(n+a+1)Z_(n+a+2) are:

i) (A, G or T) GG or (C, G or T) GG

ii) NCT

iii) (A, G or C) AA

iv) NAC

v) NTC

vi) ATG

After the synthesis of these oligonucleotides they can be combined to obtain a single-tube cosmix-plexing gene library, whereby to obtain equimolar codon frequencies for each amino-acid the gene libraries i) to vi) are present in the final mixture at a ratio of 3:4:3:4:4:1, respectively. As explained above, this mixture will always give a correct orientation on religation of type IIS restriction enzyme-cleaved fragments having the 2 bp single-stranded cohesive ends ZZ.

Again, as with the previous sets, this single-tube library represents six subsets which are unable to recombine with each other during cosmix-plexing.
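The six-set design can be checked against the standard genetic code. The sketch below uses the (A, G or T) GG alternative for set i); the other alternative, (C, G or T) GG, differs only in which Arg codon is used. One-letter amino-acid codes are used, with "*" for Stop, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Allowed bases at X_(n+a), Z_(n+a+1), Z_(n+a+2) for the six sets i) to vi).
SIX_SETS = {
    "i": ("AGT", "G", "G"),    # (A, G or T) GG
    "ii": ("ACGT", "C", "T"),  # NCT
    "iii": ("AGC", "A", "A"),  # (A, G or C) AA
    "iv": ("ACGT", "A", "C"),  # NAC
    "v": ("ACGT", "T", "C"),   # NTC
    "vi": ("A", "T", "G"),     # ATG
}

def expand(positions):
    """All codons described by one degenerate set."""
    return ["".join(c) for c in product(*positions)]

# Number of codons per set, which is also the equimolar mixing ratio.
sizes = [len(expand(p)) for p in SIX_SETS.values()]
# Amino-acids encoded across all six sets.
encoded = [CODON_TABLE[c] for p in SIX_SETS.values() for c in expand(p)]

print("codons per set:", sizes)
print("amino-acids:", "".join(sorted(encoded)))
```

The set sizes reproduce the 3:4:3:4:4:1 ratio, and the 19 codons encode 19 different amino-acids with neither Cys nor Stop among them.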

Consideration of the central amino-acid codon created during cosmix-plexing recombination

The amino-acid at the recombination site is determined by the 5′-hypervariable segment. The set of amino-acids which may be represented at this position is defined for each subset as presented in Table 2.

Consideration of the number of clones needed in a “representative” library

The minimal number of clones required in a library to include all possible amino-acid sequences in a random peptide containing ‘n’ amino-acids is 20^(n), i.e. for n=9, 20⁹=5.12×10¹¹. In fact, at a confidence limit of say 95%, this figure must be some three-fold higher to allow for the statistics of sampling, i.e. ca. 1.5×10¹². In practice this figure may be higher still due to, e.g., non-random synthesis of the oligonucleotides used to generate the library as well as biased codon representation (for a detailed discussion see Collins 1997).
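The "some three-fold" factor is consistent with a simple sampling model, sketched below. The model assumes that clones are drawn uniformly at random from all 20^(n) sequences, so that a given sequence is missing from a library of N clones with probability of about exp(-N/20^(n)). This Poisson approximation is an assumption made here for illustration; the text does not state which model was used, and the function names are hypothetical.

```python
import math

def variants(n, alphabet=20):
    """Number of distinct peptides of length n."""
    return alphabet ** n

def clones_for_confidence(n, confidence=0.95):
    """Library size at which a given peptide is present with probability
    `confidence`, under the uniform-sampling (Poisson) assumption."""
    return -math.log(1.0 - confidence) * variants(n)

print(f"20^8 = {variants(8):.3g}")
print(f"20^9 = {variants(9):.3g}")
print(f"95% library size for n=9: {clones_for_confidence(9):.3g}")
```

With a confidence of 0.95 the multiplier is -ln(0.05), i.e. about 3.0, which gives roughly 1.5×10¹² clones for a nonapeptide.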

Consideration of the number of recombined clones generated by cosmix-plexing

The cosmix-plexing strategy is based on the concept that in initial selection experiments clone populations will be enriched for sequences which contain structural elements based on the primary sequence in the varied segment. Even if the optimal sequence is not present, due to the limited size of the initial library, cosmix-plexing will increase the likelihood of finding just such a sequence by providing a large number of novel recombinants in which the 5′- and 3′-“halves” of the varied section are reassorted. For example, for the hypervariable nonapeptide library described in the example, the sequences encoding the amino-proximal five amino-acids are recombined with the sequences encoding the carboxy-proximal four amino-acids. Since the cohesive ends essentially limit the recombination to defined subsets, in which one subset cannot undergo recombination with any of the other subsets, the actual number of recombinants generated is less than could be obtained with completely random recombination.

For the initial four-tube protocol described, four separate libraries each containing four subsets are used:

Random recombination would generate, for a set of N clones, N² recombinants, assuming N² is less than or equal to the theoretical number of variants (20^(n), see above) which can be encoded within the hypervariable segment; otherwise it will tend to 20^(n).

For the four-tube protocol 16 subsets are created, each representing a pool within which recombination can take place. If the total library consists of N clones, then the number of novel recombinants which can be formed within each of the 16 subsets is (N/16)². Summing for all sixteen subsets, the number of recombinants which can be generated is 16×(N/16)²=N²/16, again assuming N²/16 is less than or equal to the theoretical number of variants (20^(n), see above) which can be encoded within the hypervariable segment; otherwise it will tend to 20^(n).

For the single-tube protocol only 6 subsets are created, each representing a pool within which recombination can take place. If the total library consists of N clones, then the number of novel recombinants which can be formed within each of the 6 subsets is (N/6)². Summing for all six subsets, the number of recombinants which can be generated is 6×(N/6)²=N²/6, again assuming N²/6 is less than or equal to the theoretical number of variants (20^(n), see above) which can be encoded within the hypervariable segment; otherwise it will tend to 20^(n).

It is thus clear that the single-tube version of the invention is superior not only in terms of time and economy of the procedure but also in the potential to generate a greater diversity from a given number of clones during cosmix-plexing guided recombination.
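The comparison of the protocols can be written as a small function. The sketch follows the formulas given above (N² for random recombination, N²/16 for the four-tube protocol, N²/6 for the single-tube protocol, each capped at 20^(n)) and, as in the text, treats the subsets as equally populated; the library size of 10⁵ clones is only an illustrative value and the function name is hypothetical.

```python
def recombinants(n_clones, subsets, peptide_length, alphabet=20):
    """Recombinants obtainable from n_clones split evenly over `subsets`
    pools that cannot recombine with each other, capped at the number
    of variants that can be encoded (alphabet ** peptide_length)."""
    ceiling = alphabet ** peptide_length
    return min(subsets * (n_clones / subsets) ** 2, ceiling)

N = 10 ** 5  # illustrative library size, not a value from the text
for label, subsets in (("random", 1), ("four-tube", 16), ("single-tube", 6)):
    print(f"{label:12s} {recombinants(N, subsets, peptide_length=9):.3g}")
```

For this example the single-tube protocol yields 16/6, i.e. about 2.7 times, as many recombinants as the four-tube protocol, as long as the ceiling of 20^(n) is not reached.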

A specific embodiment of the invention concerns a bank of genes, wherein the six sets of oligonucleotide sequences are present at a ratio of (i):(ii):(iii):(iv):(v):(vi) of (0 to 1):(0 to 1):(0 to 1):(0 to 1):(0 to 1):(0 to 1) with the proviso that at least one of said sets is present.

Further, a specific embodiment of the invention concerns a bank of genes wherein each gene is provided as display vector, especially as M13 phage or M13-like phage or as phagemid.

Further, a specific embodiment of the invention concerns a bank of genes wherein the double stranded DNA sequence is comprised by a DNA region (fusB) encoding a peptide or a protein to be displayed.

Further, a specific embodiment of the invention concerns a bank of genes, characterized in that n=j=6, a=14 and b=16.

Further, a specific embodiment of the invention concerns a bank of genes wherein the restriction enzyme is a type IIS restriction enzyme.

Further, a specific embodiment of the invention concerns a bank of genes which is characterized in that

(a) subsequence B₁ . . . B_(n) is the recognition site for the restriction enzyme BpmI (CTGGAG) and subsequence Q_(n+a+b+1) . . . Q_(n+a+b+j) is an inverted BsgI recognition site (CTGCAC); or (b) subsequence B₁ . . . B_(n) is the recognition site for the restriction enzyme BsgI (GTGCAG) and subsequence Q_(n+a+b+1) . . . Q_(n+a+b+j) is an inverted BpmI recognition site (CTCCAG).
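As a consistency check, the "inverted" sites quoted above are the reverse complements of the forward recognition sites, and the formula with n=j=6, a=14 and b=16 can be laid out to locate the two Z positions. The sketch below only arranges placeholder letters according to the formula and makes no statement about where the enzymes cut; the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

# Recognition sites and their inverted forms as quoted in the text.
SITES = {"BpmI": "CTGGAG", "BsgI": "GTGCAG"}
INVERTED = {"BpmI": "CTCCAG", "BsgI": "CTGCAC"}

for enzyme, site in SITES.items():
    print(enzyme, site, "->", reverse_complement(site),
          reverse_complement(site) == INVERTED[enzyme])

def template(b_site, q_site, a=14, b=16):
    """Lay out B(1..n) X(a) ZZ X(b-2) Q(1..j) with placeholder letters X and Z."""
    return b_site + "X" * a + "ZZ" + "X" * (b - 2) + q_site

# Variant (a): BpmI site at the 5' end, inverted BsgI site at the 3' end.
layout = template(SITES["BpmI"], INVERTED["BsgI"])
zz_start = layout.index("ZZ") + 1  # 1-based position, equals n+a+1
print(layout, len(layout), zz_start)
```

With these values the construct is n+a+b+j = 42 bases long and the ZZ doublet occupies positions 21 and 22, i.e. n+a+1 and n+a+2.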

Further, a specific embodiment of the invention concerns a bank of genes which is characterized in that the hypervariable sequence X_(n+1) . . . X_(n+a+b) contains NNB or NNK, wherein N=adenine (A), cytosine (C), guanine (G) or thymine (T);

B=cytosine (C), guanine (G) or thymine (T); and

K=guanine (G) or thymine (T).
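The two randomization schemes can be compared by enumerating their codons against the standard genetic code. The sketch uses the definitions of N, B and K given above, one-letter amino-acid codes and "*" for Stop; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Degenerate base symbols as defined in the text.
DEGENERATE = {"N": "ACGT", "B": "CGT", "K": "GT"}

def expand(scheme):
    """All codons covered by a degenerate triplet such as 'NNK'."""
    return ["".join(c) for c in product(*(DEGENERATE[s] for s in scheme))]

def summarize(scheme):
    """Return (number of codons, number of stop codons, amino-acids covered)."""
    amino_acids = [CODON_TABLE[c] for c in expand(scheme)]
    return len(amino_acids), amino_acids.count("*"), len(set(amino_acids) - {"*"})

for scheme in ("NNK", "NNB"):
    print(scheme, summarize(scheme))
```

Both schemes cover all 20 amino-acids and retain a single stop codon (TAG); NNK does so with 32 codons and NNB with 48.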

Another embodiment of the invention concerns a phagemid pROCOS4/7 of the sequence shown in FIG. 6.

Still another embodiment of the invention concerns a phagemid pROCOS5/3 of the sequence shown in FIG. 7.

Another embodiment of the invention concerns a method for the productionof large

phage-display libraries or

phagemid-display libraries,

containing or consisting of optionally packaged recombined display vectors, wherein recombination takes place at the cleavage site(s) for a restriction enzyme (cut (B) enzyme; arrow in FIG. 3) and wherein

(a) to (b) a double-stranded DNA prepared from Escherichia coli cells containing a display vector population, consisting of M13 phages or M13-like phages or consisting of phagemids according to the invention; a cosmid vector; a restriction enzyme for cut (B); and a restriction enzyme for cut (A) are selected, wherein

(i) the cut (B) enzyme cleaves the display vectors in the region encoding the displayed peptide or displayed protein (arrow in FIG. 3) and generates unique non-symmetrical cohesive ends, wherein each cohesive end is a 2 bp single strand end formed by the two bases designated Z, and

(ii) the cut (A) enzyme cleaves the display vectors and the cosmid vector and generates upon cleavage unique non-symmetrical cohesive ends (fusA) which differ from those resulting from cut (B),

(c) the display vectors are cleaved with the first restriction enzyme,

(d) the display vector and the cosmid vector are cleaved with the second restriction enzyme,

(e) the cleaved display vectors are ligated with the cleaved cosmid vectors forming concatemers,

(f) the ligation product is subjected to a lambda packaging and transduced into an Escherichia coli host,

(g) if wanted, selection is made for a gene present in the ligated display vectors,

(h) the transduced display vectors in the Escherichia coli host are

either in the case of a phage-display vector spontaneously packaged in M13 or M13-like phage coats

or in the case of a phagemid-display vector packaged by infecting the Escherichia coli host with an M13 type helper phage (superinfection),

(i) the packaged display vectors are passaged in a fresh Escherichia coli host and phage-display or phagemid-display libraries are formed and, if wanted,

(j) the passaged display vectors are

either in the case of a phage-display vector spontaneously packaged in M13 or M13-like phage coats

or in the case of a phagemid-display vector packaged by infecting the fresh Escherichia coli host with an M13 type helper phage (superinfection) and

phage-display or phagemid-display libraries are formed.

A specific embodiment of the invention concerns a method which is characterized in that in steps (a) to (b) a type IIS restriction enzyme is selected, preferably BglI, DraIII, BsgI or BpmI.

Further, a specific embodiment of the invention concerns a method which is characterized in that for cuts (B) and (A) the same restriction site and/or restriction enzyme is selected.

Further, a specific embodiment of the invention concerns a method which is characterized in that as cut (B) enzyme and as cut (A) enzyme different enzymes are used (FIG. 3), preferably BsgI or BpmI as cut (B) enzyme and DraIII as cut (A) enzyme (fd or M13 replication origin cut).

Further, a specific embodiment of the invention concerns a method which is characterized in that in step (h) and facultatively in step (j) M13K07 is used as M13 type helper phage.

Further, a specific embodiment of the invention concerns a method which is characterized in that the phagemid and the cosmid are identical and, further, presence of and cleavage with cut (A) enzyme is optional and/or cut (B) enzyme and cut (A) enzyme are identical.

Further, a specific embodiment of the invention concerns a method which is characterized in that in step (i) the multiplicity of infection (MOI) is less than or equal to 1.

Further, a specific embodiment of the invention concerns a method wherein the cosmid comprises an fd or M13 bacteriophage origin (replication/packaging).

Further, a specific embodiment of the invention concerns a method wherein in step (e) a mol ratio of display vectors to the cosmid vector within the range of from 3:1 to 15:1 and preferably 3:1 to 10:1 is used.

Further, a specific embodiment of the invention concerns a method wherein in step (e) a vector concentration (comprising display vectors and cosmid vectors) of more than 100 μg DNA/ml is used.

Another embodiment of the invention concerns a method for the production of large

phage-display extension libraries or

phagemid-display extension libraries, wherein

an oligonucleotide cassette of d bases in length is inserted into a restriction site (cut (B)) via the cohesive ends ZZ as defined above to yield a sequence (supra sequence) or a gene comprising a double stranded DNA sequence which is represented by the following formula of one of their strands:

5′ B₁ . . . B_(n) X_(n+1) . . . X_(n+a+d) Z_(n+a+d+1) Z_(n+a+d+2) X_(n+a+d+3) . . . X_(n+a+d+b) Q_(n+a+d+b+1) . . . Q_(n+a+d+b+j) 3′

wherein d is an integer and a multiple of 3, preferably within the range of from 6 to 36; n, a, b and j and B, X, Z and Q have the same meaning as in any of the preceding claims; and wherein

(a) to (b) a double-stranded DNA prepared from Escherichia coli cells containing a display vector population, consisting of M13 phages or M13-like phages or consisting of phagemids according to the invention; a cosmid vector; a restriction enzyme for cut (B); and a restriction enzyme for cut (A) are selected, wherein

(i) the cut (B) enzyme cleaves the display vectors in the region encoding the displayed peptide or displayed protein and generates unique non-symmetrical cohesive ends; wherein each cohesive end is a 2 bp single strand end formed by the two bases designated Z,

(ii) the cut (A) enzyme cleaves the display vectors and the cosmid vector such that unique non-symmetrical cohesive ends are formed which differ from those resulting from cut (B),

(c1) the display vectors are cut with the cut (B) restriction enzyme,

(c2) a DNA cassette is inserted into the cleavage site via its ZZ cohesive ends,

(d) the resulting display vector and the cosmid vector are cleaved with the cut (A) restriction enzyme,

(e) the cleaved display vectors are ligated with the cleaved cosmid vectors forming concatemers,

(f) the ligation product is subjected to a lambda packaging and transduced into an Escherichia coli host such that the DNA cassette lies between two hypervariable sequences (extension sequences),

(g) if wanted, selection is made for a gene present in the ligated display vectors,

(h) the transduced display vectors in the Escherichia coli host are

either in the case of a phage-display vector spontaneously packaged in M13 or M13-like phage coats

or in the case of a phagemid-display vector packaged by infecting the Escherichia coli host with an M13 type helper phage (superinfection),

(i) the packaged display vectors are passaged in a fresh Escherichia coli host and phage-display or phagemid-display libraries are formed, and, if wanted,

(j) the passaged display vectors are

either in the case of a phage-display vector spontaneously packaged in M13 or M13-like phage coats

or in the case of a phagemid-display vector packaged by infecting the fresh Escherichia coli host with M13 type helper phages (superinfection) and

phage-display or phagemid-display extension libraries are formed.

Another embodiment of the invention concerns a method for the reassortment of the 5′- and/or 3′-extensions in the production of large recombinant

phage-display extension libraries or

phagemid-display extension libraries,

comprising the sequence as defined before wherein recombination takes place at one or the other, or consecutively at both the cleavage site(s) ZZ bracketing the inserted cassette(s), wherein

(a) to (b) a double-stranded DNA prepared from Escherichia coli cells containing a display vector population, consisting of M13 phages or M13-like phages or consisting of phagemids as display vectors as defined before; a cosmid vector; a restriction enzyme for cut (B); and a restriction enzyme for cut (A) are selected, wherein

(i) the cut (B) enzyme cleaves the display vectors in the region encoding the displayed peptide or displayed protein and generates unique non-symmetrical cohesive ends at selectively either

the 5′-junction of extension and cassette (cleavage by the restriction enzyme recognizing the binding site B₁ . . . B_(n) as defined before), or

at the 3′-junction of extension and cassette (cleavage by the restriction enzyme recognizing the binding site Q_(n+a+b+1) . . . Q_(n+a+b+j) as defined before, or Q_(n+a+d+b+1) . . . Q_(n+a+d+b+j) as defined before), wherein each cohesive end is a 2 bp single strand end formed by the two bases designated Z,

(ii) the cut (A) enzyme cleaves the display vectors and the cosmid vector and generates upon cleavage unique non-symmetrical cohesive ends which differ from those resulting from cut (B),

(c) the display vectors are cleaved with the first restriction enzyme,

(d) the display vector and the cosmid vector are cleaved with the second restriction enzyme,

(e) the cleaved display vectors are ligated with the cleaved cosmid vectors forming concatemers,

(f) the ligation product is subjected to a lambda packaging and transduced into an Escherichia coli host,

(g) if wanted, selection is made for a gene present in the ligated display vectors,

(h) the transduced display vectors in the Escherichia coli host are

either in the case of a phage-display vector spontaneously packaged in M13 or M13-like phage coats

or in the case of phagemid-display vectors packaged by infecting the Escherichia coli host with an M13-type helper bacteriophage (superinfection),

(i) the packaged display vectors are passaged in a fresh Escherichia coli host and phage-display or phagemid-display libraries are formed and, if wanted

(j) the passaged display vectors are

either in the case of a phage-display vector spontaneously packaged in M13 or M13-like phage coats

or in the case of a phagemid vector packaged by infecting the fresh Escherichia coli host with M13 type helper phages (superinfection) and phage-display or phagemid-display libraries are formed.

A specific embodiment of the invention concerns a method which is characterized in that in steps (a) to (b) a type IIS restriction enzyme is selected, preferably BglI, DraIII, BsgI or BpmI.

Further, a specific embodiment of the invention concerns a method which is characterized in that for cuts (i) and (ii) the same restriction site is selected.

Further, a specific embodiment of the invention concerns a method which is characterized in that as cut (B) enzyme and as cut (A) enzyme different enzymes are used, preferably BsgI or BpmI as cut (B) enzyme and DraIII as cut (A) enzyme (fd or M13 replication origin is cut).

Further, a specific embodiment of the invention concerns a method which is characterized in that in step (h) and facultatively in step (j) M13K07 is used as the M13-type helper phage.

Further, a specific embodiment of the invention concerns a method which is characterized in that in step (g) selection is made for the presence of an antibiotic resistance gene.

Further, a specific embodiment of the invention concerns a method which is characterized in that in step (i) the multiplicity of infection (MOI) is less than or equal to 1.

Further, a specific embodiment of the invention concerns a method wherein the cosmid comprises an fd or M13 bacteriophage origin.

Further, a specific embodiment of the invention concerns a method wherein in step (e) a mol ratio of display vectors to the cosmid vector within the range 3:1 to 15:1 and preferably 3:1 to 10:1 is used.

Further, a specific embodiment of the invention concerns a method wherein in step (e) a vector concentration (comprising display vectors and cosmid vectors) of more than 100 μg DNA/ml is used.

Another embodiment of the invention concerns a method for the de novo production of large

phage-display libraries or

phagemid-display libraries,

comprising DNA sequences as defined before, and subjectable to recombination according to a procedure as defined before, wherein recombination takes place within a DNA sequence as defined before, wherein

a) a display vector, consisting of an M13 phage or M13-like phage or consisting of a phagemid-display vector comprising a bacteriophage replication origin, facultatively a gene for a selectable marker, preferably an antibiotic resistance, a lambda bacteriophage cos-site and a “stuffer”-sequence (FIG. 5 upper right), containing two binding sites for a type IIS restriction enzyme different from any of the enzymes as defined before (cut (B) and cut (A)), wherein said two sites are oriented in divergent orientation and where the cohesive ends generated on cleavage are non-symmetrical and differ from one another at the two sites, and

b) a PCR-generated fragment comprising part of one of the sequences as defined before, including a (the) hypervariable sequence(s), preferably X_(n+1) . . . X_(n+a)Z_(n+a+1)Z_(n+a+2)X_(n+a+3) . . . X_(n+a+b) according to the invention, bracketed by the same type IIS restriction enzyme binding sites defined in (a), but in this case both oriented inwards towards the hypervariable sequence (FIG. 5 left side) and where on cleavage by this restriction enzyme two non-symmetrical, single strand ends different from one another are generated, where the first end (a' in FIG. 5) is complementary to one of the ends (a in FIG. 5) generated on the large vector fragment in (a) and the second end (b' in FIG. 5) is complementary to the other end (b in FIG. 5) generated on the large vector fragment in (a),

c) the two cleavage reaction systems (a) and (b) still containing the active type IIS restriction enzyme are mixed together in approximately equimolar proportions and subjected to ligation in the presence of DNA ligase;

fragments containing the restriction enzyme binding sites are constantly removed (“stuffer” fragment and outer end of the PCR product) whereas the other two components, namely the large vector fragment and the insert sequence (central fragment from the PCR reaction) are driven to form

A) a concatemeric hybrid if the ligation is carried out at >100 μg DNA/ml (FIG. 5), or

B) a circular hybrid if the ligation is carried out at ≤40 μg DNA/ml,

d1) in the case of protocol A) the DNA is packaged into lambda particles and transduced into an Escherichia coli host,

d2) in the case of protocol B) the DNA is transformed in an Escherichia coli host,

e) if wanted, selection is made for a gene present in the ligated display vectors,

f) the transduced display vectors in the Escherichia coli host are

either in the case of a phage-display vector spontaneously packaged in M13 or M13-like phage coats

or in the case of phagemid-display vectors packaged by infecting the Escherichia coli host with an M13-type helper bacteriophage (superinfection),

(g) the packaged display vectors are passaged in a fresh Escherichia coli host and phage-display or phagemid-display libraries are formed and, if wanted

(h) the passaged display vectors are

either in the case of a phage-display vector spontaneously packaged in M13 or M13-like phage coats

or in the case of a phagemid vector packaged by infecting the fresh Escherichia coli host with M13-type helper phages (superinfection) and

phage-display or phagemid-display libraries are formed.

A specific embodiment of the invention concerns a method which is characterized in that in steps (a) to (b), as type IIS restriction enzyme, preferably BpiI, BsgI or BpmI is selected.

Further, a specific embodiment of the invention concerns a method which is characterized in that in step (f) and facultatively in step (h) M13K07 is used as the M13-type helper phage.

Further, a specific embodiment of the invention concerns a method which is characterized in that in step (e) selection is made for the presence of an antibiotic resistance gene.

Further, a specific embodiment of the invention concerns a method which is characterized in that in step (g) the multiplicity of infection (MOI) is less than or equal to 1.

Another embodiment of the invention concerns a phage-display library or a phagemid-display library in the form of packaged particles obtainable according to any of the methods as described before.

Another embodiment of the invention concerns a phage-display library or a phagemid-display library in the form of display vectors comprised by Escherichia coli population(s) obtainable according to any of the methods as described before.

Another embodiment of the invention concerns phage-display libraries or phagemid libraries which are characterized by a gene (genes) as defined before and obtainable according to the invention, wherein the term “large” as used before is defined as in excess of 10⁶ variant clones, preferentially 10⁸ to 10¹¹ variant clones.

Finally, another embodiment of the invention concerns a protein or peptide comprising a peptide sequence encoded by a DNA sequence as defined before and obtainable by affinity selection procedures on a defined target by means of libraries as defined before.

Detailed Description

The invention pertains to a novel combination of recombinant DNA technologies to produce large hypervariable gene banks for the selection of novel ligands of pharmaceutical, diagnostic, biotechnological, veterinary, agricultural and biomedical importance with an efficiency higher than was hitherto attainable.

The size of the hypervariable gene bank is presently considered the most essential factor limiting the usefulness of the methodology for such purposes, since, as an empirical method, it depends on the diversity (number of different variants) initially generated in the bank (hypervariable gene library). In contrast to this traditional opinion we consider that, when a highly efficient method is developed, as presented here, to generate a large proportion of the possible combinations of mutated segments of the variants from a preselected subpopulation, a population enriched for the desired structural elements will be generated which would only have been represented in a population approaching N^(x) where N is the size of the original population and x is the number of segments to be recombined.

The first part of the invention pertains to novel sequences which allow recombination within hypervariable DNA sequences encoding regions (domains) of variable peptides or proteins displayed in combinatorial phage/phagemid display libraries using type IIS restriction endonucleases both (a) to introduce a cut at the site of recombination and (b) to generate oriented substrates for a ligation reaction, where the ligation products are then recloned at high efficiency after in vitro packaging in a lambda packaging mix. The entire protocol yields efficiencies (clones per input DNA) in excess of any described technology (>10⁸ clones per microgram ligated DNA).

Combinations of (vector) sequences and protocols are claimed for both the production of the initial libraries and for recombinational procedures to generate increased diversity within the library or a selected subpopulation at any time. In particular such sequences and procedures are claimed for the generation and use of phage/phagemid-display combinatorial libraries.

The inventors recognize that the main factor thereby determining the efficient generation of further variation is the efficient production of combinatorial libraries from the initial libraries, via reassortment of smaller elements (specific peptide sequences within the hypervariable region, and/or reassortment of structural domains) which contribute to the properties selected for. The invention presents such a method, which has the unique property that the recombination site may be within the hypervariable region whereby no restriction is imposed on the sequence within the hypervariable region involved. Alternatively the method can be used to reassort domains of proteins or subunits of heteromeric proteins (proteins composed of two or more different variant polypeptide chains), each of which can contain hypervariable regions, without resorting to recloning isolated DNA fragments or generating new libraries containing new synthetic oligonucleotides. It is noted that this method thus offers a saving in both time and materials when optimizing a structure for a predetermined property on the basis of a preselected clone population (subpopulation) and in view of the geometrical increase in possible variability offered may represent a qualitatively novel feature in that some rare structures may be obtainable only by the novel strategy described.

The method, which we designate cosmix-plexing®, is based on the design of the cloning vectors, the inserts used and a combination of special recombinant DNA protocols, which in particular use i) cleavage of the phage/phagemid DNA with type IIS restriction enzymes, ii) subsequent ligation to concatemers which are iii) packaged in vitro with a lambda packaging system for iv) efficient transduction into E. coli strains, where they are then v) repackaged in vivo in filamentous phage coats. The use of cosmix-plexing®, so defined, on a heterogeneous phage/phagemid population generates an enormous increase in novel variants at any time during further experimentation, e.g. after any enrichment step for structures having the predetermined property or properties.

In particular subpopulations which are enriched from the original library for a specific property will be enriched for a consensus motif (a degenerate set of related sequences within the varied region(s) which all exhibit the required property to some extent) which may (probably will) include the optimal sequence in terms of the required property. Reassortment of these regions or portions of a single hypervariable sequence by cosmix-plexing® will increase the probability of obtaining the optimal sequence. The subpopulations may be isolated by differential affinity-based selection on a defined target, or enrichment procedures based on other desired selectable properties (example 1: substrate properties such as phosphorylation by a particular protein kinase enriched by binding on antibodies which recognize the modified (in this case phosphorylated) substrate; or example 2: cleavage of the variant sequence by an endoprotease, using selective release of the phage or phagemid previously bound via an interaction between a terminal protein structure (anchor) and its ligand immobilized to, or later trapped on, a surface).

The invention further covers the generation of extension libraries in which e.g. a “project-specific cassette” is inserted at the recombination site within the gene bank. Optimisation of ligands can then occur by the generation of further combinatorial libraries from selected clones in which the adjacent regions may be efficiently “shuffled”, either singly or both at a time. As far as we are aware no other system provides this “cassette insertion/exchange” feature.

BRIEF DESCRIPTION OF THE DRAWINGS:

FIG. 1. Diagrammatic representation of the steps involved in creating recombination within the hypervariable regions of cosmix-plexing® libraries.

Double-stranded phagemid from a number of clones (which may be a cosmid itself) and cosmid DNA (if the phagemid is not a cosmid) are cleaved with a type IIS restriction enzyme (cleavage sites indicated by a small bar) within the hypervariable region and ligated together at high DNA concentration so that long concatemers of the DNA molecules are formed, which are all oriented in the same direction, e.g. with respect to the M13 packaging origins, i.e. no palindromic regions are formed. The vectors contain one or more restriction site(s) for the type IIS restriction enzyme such that no cohesive ends are formed which on ligation could form palindromic (i.e. head-to-head or tail-to-tail) structures. When the cohesive ends produced on cleavage by the restriction enzyme are themselves non-palindromic and unique to each restriction site within each plasmid/phagemid, only ring closure and the formation of concatemers can occur. At higher DNA concentrations (i.e. over 200 μg/ml) concatemer formation will be preferred. A more detailed presentation of the molecular structures formed is given in FIGS. 2 and 3. The ligation product is added to an in vitro lambda packaging extract where the DNA is packaged into a lambda bacteriophage coat as a linear DNA of 37 to 50 kb cleaved at a lambda cos-site. In the following step, referred to as transduction, these particles carrying the cosmid-phagemid hybrid DNA are added to Escherichia coli cells (shown as large ellipses in the diagram) into which the DNA injects itself. In the cell it is circularized by closure of the cleaved cos-site using the endogenous DNA ligase. It is then propagated as a large cosmid-phagemid hybrid, replicating from the plasmid DNA replication origin(s). M13-type helper phage (e.g. M13K07) is added to these cells in the step referred to as superinfection. On entry of the helper phage single strand replication is initiated from the M13 replication origins present in the individual copies of the phagemid contained in the concatemer. During this process the phage are also packaged into M13 coats, and secreted into the medium. The phagemid can be harvested from the supernatant of the culture. A second passage, i.e. transduction into an E. coli host and repackaging by superinfection with helper phage, is necessary before these phagemid are used in a selection procedure in order to ensure that a particular variant protein is presented only on the particle carrying the gene for that particular variant protein. It is noted that this is a highly efficient process in which a yield of more than 10⁸ different phagemid can be produced per microgram of ligated input DNA.

FIG. 2. The diagram illustrates the DNA structures formed when the cosmix-plexing® protocol is carried out as shown in FIG. 1. Different variants are designated by different patterns for the whole plasmid. Initially double-stranded DNA is cleaved with a type IIS restriction enzyme A. The ligation product is illustrated as a concatemer in which each phagemid is oriented in the same orientation. The products of 37 to 50 kb introduced after in vitro lambda packaging and introduction into the E. coli cells (shaded ellipses) are shown, whereby, for example 8 to 10 copies of a 4.5 kb phagemid may be present per cell. On repackaging the same phagemid are obtained as were present before cleavage and ligation. The protocol as shown here in which the M13-packaging/replication site and the restriction site for enzyme A are identical, is simply an efficient method of amplification when starting with double stranded DNA.

FIG. 3. The diagram illustrates a variant of the protocol illustrated in FIGS. 1 and 2 in which recombination is achieved between different phagemid variants. The cross-over point for the recombination is the cleavage site for the type IIS restriction enzyme B (shown as a hollow arrow) cleaving preferentially within a hypervariable region or between two different variable regions (see also FIG. 4, where additional cleavage sites within other variable regions may be recombined simultaneously). Again, as mentioned in the FIG. 1 legend, each phagemid may be a cosmid itself, in which case the addition of another cosmid is unnecessary. In this example cleavage with the restriction enzyme A is optional. Although FIGS. 2 and 3 are almost identical it should be noted that the products of the scheme in FIG. 3 are all recombined, i.e. hybrids of the two sides of different variants. Repassaging is needed before use of the recombined library in selection experiments for the same reasons discussed in the previous two Figures.

FIG. 4. Cosmix-plexing® strategies.

The left part of the figure shows the hypervariable DNA sequences encoding the variable portion of the peptide or protein presented on the phage/phagemid. The four bars designated ‘N variants’ show that there are different sequences on either side of the type IIS restriction cleavage site. Phagemid DNA from the variant clones can be cleaved with the type IIS restriction enzyme and religated to yield the indicated number of recombinant clones, within the limits of the cloning efficiency. If one starts with a subpopulation of preenriched variants from the primary library (say 4×10⁴ clones) then one-sixteenth of all possible recombinants (10⁸) can be obtained.
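The arithmetic behind the "one-sixteenth" figure can be reproduced as follows. This is a minimal sketch which assumes that each variant contributes two independently reassortable halves (one on either side of the cleavage site), so that N variants give N² possible recombinants; the function name possible_recombinants is illustrative.

```python
from fractions import Fraction


def possible_recombinants(n_variants: int, segments: int) -> int:
    """Number of combinations when each of `segments` positions is drawn
    independently from `n_variants` parental variants (N to the power x)."""
    return n_variants ** segments


n = 4 * 10**4                          # preenriched subpopulation from the text
total = possible_recombinants(n, 2)    # two segments flank one cleavage site
obtained = 10**8                       # recombinant clones quoted in the text

print(total)                           # 1600000000
print(Fraction(obtained, total))       # 1/16
```

The same function with segments greater than 2 gives the N^(x) population size referred to in the detailed description for recombination at several sites.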

The construction of “extension libraries” is shown below the dotted line. In this case a project-specific cassette containing a biased codon distribution encoding some sequence elements previously defined as advantageous for binding to the target is inserted into the hypervariable sequence at the type IIS restriction cleavage site. The large library thus generated encodes a protein containing three segments (domains B, A and C), whereby the central domain A is encoded by the project-specific cassette, and is bordered by the hypervariable domains B and C.

The formulae for the numbers of variants obtained are given for the protocol in which four separate libraries are constructed.

The right side of the figure illustrates how the variant protein might bind to a target protein. The variants selected from the extension library are expected to have a larger surface of interaction and thus to exhibit stronger and/or more specific binding to the defined target. The target may be a cell, a (partially) purified protein or peptide e.g. enzyme, antibody, hormone or lymphokine, cell receptor or in fact any defined surface or particle suspension, possibly coated with one of the aforementioned targets, which is amenable to physical separation, i.e. the wall of a receptacle (tube, tubing, flask, microtiter plate, a planar surface), or a particle (e.g. beads, magnetic beads, or droplets in a two-phase liquid system).

FIG. 5: Driven directed cloning (DDC)

This figure illustrates an example of a cloning protocol which has excellent properties for the highly efficient construction of hypervariable libraries and extension libraries, which can be used with the cosmix-plexing® method. The left side of the figure shows the preparation of the hypervariable cassette to be inserted into the cosmid-phagemid double-stranded vector. The cosmid-phagemid vector containing a “stuffer fragment” is shown on the right. Both the PCR-product containing the hypervariable sequence, shown as a line of asterisks, and the vector containing the “stuffer” are cleaved with the same type IIS restriction enzyme(s). It is noted that the recognition sites for this (these) enzyme(s) are oriented in opposite directions, i.e. outwards from the stuffer in the case of the vector, and inwards in the case of the PCR-product. After cleavage neither the hypervariable cassette to be inserted nor the vector contains any of the original type IIS restriction enzyme recognition sites. The vectors and insert are, however, designed to have non-palindromic cohesive ends at their termini, generated by the restriction enzyme cleavage, so that a ligation of insert and vector leads to an oriented insertion of the hypervariable region. In addition, the vector cannot undergo ring closure in the absence of the insert cassette nor can the insert fragments ligate to one another. Since the ligation is carried out at high DNA concentration and in the continued presence of the restriction enzyme any ligation product resembling the initial uncleaved or partially cleaved vector or PCR-product will be immediately recleaved. This combination of oriented non-palindromic cohesive ends and recleavage of unwanted ligation products drives, especially at high DNA concentration, where the formation of ring closure of a vector-insert-hybrid is at a disadvantage, the formation of oriented double-stranded concatemers of the structure required for highly efficient cosmid packaging. The primary cosmix-plexing library is formed finally by transducing the packaged cosmid-phagemid hybrids into an E. coli host which contains, or is superinfected with, an M13-like helper phage. The phagemid are repassaged in a second M13 phage-packaging step before use in selection so that individual phage clones are derived from singly infected cells. This is necessary in order that each phagemid particle carries the variant encoded in its genome.

This is not the situation in the first packaging step in which the E. coli host contains a concatemer of some eight different variant phagemid.
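Why a low multiplicity of infection favours singly infected cells in the repassaging step can be illustrated numerically. The sketch below assumes, as is common but not stated in this specification, that the number of particles entering each cell follows a Poisson distribution with mean equal to the MOI; the function name is hypothetical.

```python
import math


def fraction_singly_infected(moi: float) -> float:
    """Among cells that received at least one particle, the fraction that
    received exactly one, assuming Poisson-distributed infection events."""
    if moi <= 0:
        raise ValueError("MOI must be positive")
    p_zero = math.exp(-moi)           # cells with no particle
    p_one = moi * math.exp(-moi)      # cells with exactly one particle
    return p_one / (1.0 - p_zero)


for moi in (1.0, 0.5, 0.1):
    print(moi, round(fraction_singly_infected(moi), 3))
```

Under this assumption an MOI of 1 leaves roughly four in ten infected cells carrying more than one phagemid, whereas lowering the MOI to 0.1 brings the singly infected fraction to about 95%.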

Recombination can be achieved within the hypervariable region of thegene encoding the protein or peptide presented on the phagemid accordingto the scheme illustrated in FIG. 1. With extension libraries, eitherthe left (5′) or right (3′) extension, or both, can be reassorted bycleaving with a type [IS restriction enzyme recognizing a site borderingeither left end, the right end (opposite orientation), or both endsrespectively, as described for the sequences B₁-B_(n) and Q_(n+a+1) . .. Q_(n+a+i) in claim 3.

The use of hypervariable sequences in the description of the invention implies in general that we try to use a set of oligonucleotides in which “randomized sequences” encode amino acids at ratios near to those normally found in natural proteins, and in which the frequency of stop codons is reduced. We are aware that for certain applications biased subsets may be preferable in the construction of dedicated sublibraries.

FIG. 6 FIG. 6A shows a diagram of the phagemid pROCOS4/7. FIG. 6B through FIG. 6E show the sequence of the phagemid pROCOS4/7 (SEQ ID NO:17).

FIG. 7 FIG. 7A shows a diagram of the phagemid pROCOS5/3. FIG. 7B through FIG. 7E show the sequence of the phagemid pROCOS5/3 (SEQ ID NO:18).

EXAMPLE 1

Cosmix-plexing using the four-tube method (according to claims 1-10)

1a) Library generation

Oligonucleotide Sequences:

NONA-CA (SEQ ID NO: 1): 5′ TCGG GGTACC TGGAGCA (XNN)₄ KKN (XNN)₄ GCTGCACGG GAGCTC GCC 3′ (KpnI site: GGTACC; SacI site: GAGCTC)

NONA-CT (SEQ ID NO: 2): 5′ TCGGGGTACCTGGAGCA (XNN)₄ RRN (XNN)₄ GCTGCACGGGAGCTCGCC 3′

NONA-GA (SEQ ID NO: 3): 5′ TCGGGGTACCTGGAGCA (XNN)₄ YYN (XNN)₄ GCTGCACGGGAGCTCGCC 3′

NONA-GT (SEQ ID NO: 4): 5′ TCGGGGTACCTGGAGCA (XNN)₄ MMN (XNN)₄ GCTGCACGGGAGCTCGCC 3′

where X means: A, C and G; N: A, C, G and T; K: G and T; R: G and A; Y: C and T; M: C and A.

NONA PCR-L (SEQ ID NO: 5): 5′ GGCGAGCTCCCGTGCAGC 3′

NONA PCR-R (SEQ ID NO: 6): 5′ TCGGGGTACCTGGAGCA 3′

KpnI (GGTACC) and SacI (GAGCTC) restriction enzyme recognition sites are marked in bold type.
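The effect of the degenerate codon definitions above on stop-codon frequency can be checked by enumeration. The following is a minimal sketch, assuming the standard genetic code; the helper names `expand` and `summarize` are illustrative and not part of the original protocol.

```python
from itertools import product

# Degenerate symbols exactly as defined in the text above
# (X is the document's own symbol for A, C or G).
SYMBOLS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT", "X": "ACG", "K": "GT", "R": "GA", "Y": "CT", "M": "CA",
}

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def expand(pattern):
    """Return every concrete codon matched by a degenerate codon pattern."""
    return ["".join(c) for c in product(*(SYMBOLS[s] for s in pattern))]


def summarize(pattern):
    """Return (number of codons, stop codons, encoded amino acids)."""
    codons = expand(pattern)
    stops = [c for c in codons if CODE[c] == "*"]
    amino_acids = sorted({CODE[c] for c in codons} - {"*"})
    return len(codons), stops, amino_acids


if __name__ == "__main__":
    for pattern in ("XNN", "KKN", "RRN", "YYN", "MMN"):
        n, stops, aas = summarize(pattern)
        print(pattern, n, "codons; stops:", stops, "; amino acids:", "".join(aas))
```

Because all three stop codons of the standard code begin with T, excluding T from the first position (X) removes them from the flanking XNN positions, at the cost of not encoding F, Y, C and W there.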

Important vector DNA sequences:

pROCOS4/7 (SEQ ID NO: 7): 5′ GGCGAGCTCCCGTGCAGCGCTCCAGGTACCCCGATATCAGAGCTGAA 3′ (sites, 5′ to 3′: SacI, BsgI, Eco47III, BpmI, KpnI; |pIII→ begins at the final GAA)

pROCOS4/7-Stuffer1 (SEQ ID NOS: 8, 9): 5′ GGCGAGCTCCCGTGCAGCGCT... ...AGCGCTCCAGGTACCCCGATATCAGAGCTGAA 3′ (sites, 5′ to 3′: SacI, BsgI, Eco47III, Eco47III, BpmI, KpnI; |pIII→ begins at the final GAA; the region between the two Eco47III sites is the 952 bp Eco47III fragment of plasmid pBR322)

KpnI (GGTACC), SacI (GAGCTC), BsgI (GTGCAG), Eco47III (AGCGCT) and BpmI (CTGGAG) restriction enzyme recognition sites are marked in bold type. The first codon of the mature pIII protein (GAA) is indicated.
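The positions of these recognition sites in the pROCOS4/7 sequence given above can be located with a simple two-strand search. This is an illustrative sketch only; `revcomp` and `find_sites` are hypothetical helper names, and positions are 0-based offsets on the strand as printed.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, site):
    """Return sorted (offset, strand) hits for a recognition site.

    '+' means the site reads 5'->3' on the printed strand; '-' means it
    reads 5'->3' on the complementary strand. Palindromic sites are
    reported once, as '+'.
    """
    last = len(seq) - len(site) + 1
    hits = [(i, "+") for i in range(last) if seq.startswith(site, i)]
    rc = revcomp(site)
    if rc != site:
        hits += [(i, "-") for i in range(last) if seq.startswith(rc, i)]
    return sorted(hits)


# pROCOS4/7 sequence and recognition sites as quoted in the text.
PROCOS47 = "GGCGAGCTCCCGTGCAGCGCTCCAGGTACCCCGATATCAGAGCTGAA"
SITES = {
    "SacI": "GAGCTC",
    "BsgI": "GTGCAG",
    "Eco47III": "AGCGCT",
    "BpmI": "CTGGAG",
    "KpnI": "GGTACC",
}

if __name__ == "__main__":
    for name, site in SITES.items():
        print(name, find_sites(PROCOS47, site))
```

In this sequence the BpmI site is found only on the complementary strand, i.e. as CTCCAG on the printed strand, which is the form in which it is quoted in Example 2.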

For the generation of double-stranded DNA inserts the single-stranded hypervariable DNA oligos NONA-CA, NONA-CT, NONA-GA and NONA-GT are amplified using the single-stranded DNA oligos NONA PCR-L and NONA PCR-R as PCR primers according to the following protocol:

Remark: the four hypervariable DNA oligos have to be kept strictly separated!

PCR-Amplification of DNA Oligos

PCR-buffer (10X): KCl 500 mM; Tris-HCl (pH 9.0) 100 mM; Triton X-100 1%

Taq DNA polymerase (Promega) in storage buffer A: glycerol 50%; Tris-HCl (pH 8.0) 50 mM; NaCl 100 mM; EDTA 0.1 mM; DTT 1 mM; Triton X-100 1%

TE-buffer (1X): Tris-HCl (pH 8.0) 10 mM; EDTA 0.1 mM

1. Transfer 2 μl of a 10 pmol/μl solution of the hypervariable oligos NONA-CA, -CT, -GA and -GT into a 0.2 ml PCR reaction tube (4 tubes).

2. Mix the following in one Eppendorf reaction tube:
ddH₂O: 276.75 μl
PCR-buffer (10X): 45.0 μl
NONA PCR-L (100 pmol/μl): 9.0 μl
NONA PCR-R (100 pmol/μl): 9.0 μl
dNTPs (10 mM each): 9.0 μl
Taq DNA polymerase (5 U/μl): 2.25 μl

3. Transfer 78 μl of this mixture to each of the PCR tubes containing the hypervariable oligos (step 1).

4. Mix 45 μl MgCl₂ (25 mM) and 45 μl ddH₂O in an Eppendorf reaction tube.

5. Preheat a PCR thermocycler to 94° C. (if possible use a heated lid).

6. Transfer 20 μl of the MgCl₂ solution (step 4) into each of the PCR tubes (step 3).

7. Put the tubes directly into the thermocycler (simplified hot-start) and run the following program:
1. 94° C. 30 sec
2. 94° C. 10 sec
3. 52° C. 10 sec
4. repeat steps 2 and 3 nine times
5. hold at 4° C.
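The volumes in steps 1 to 6 can be translated into per-reaction amounts as a consistency check. The sketch below uses only the figures given in the protocol; the function name `per_reaction` and the dictionary keys are illustrative.

```python
# Master mix volumes (μl) from step 2 of the protocol.
MASTER_MIX_UL = {
    "ddH2O": 276.75,
    "PCR-buffer (10X)": 45.0,
    "NONA PCR-L (100 pmol/ul)": 9.0,
    "NONA PCR-R (100 pmol/ul)": 9.0,
    "dNTPs (10 mM each)": 9.0,
    "Taq DNA polymerase (5 U/ul)": 2.25,
}

TEMPLATE_UL = 2.0        # step 1: hypervariable oligo at 10 pmol/μl
MIX_PER_TUBE_UL = 78.0   # step 3
MGCL2_PER_TUBE_UL = 20.0 # step 6: 25 mM MgCl2 diluted 1:1 (step 4) = 12.5 mM


def per_reaction():
    """Return final amounts and concentrations in one PCR tube."""
    total_mix = sum(MASTER_MIX_UL.values())           # 351 μl in total
    share = MIX_PER_TUBE_UL / total_mix                # fraction per tube
    volume = TEMPLATE_UL + MIX_PER_TUBE_UL + MGCL2_PER_TUBE_UL
    return {
        "volume_ul": volume,
        "buffer_x": MASTER_MIX_UL["PCR-buffer (10X)"] * share * 10 / volume,
        "primer_pmol_each": MASTER_MIX_UL["NONA PCR-L (100 pmol/ul)"] * share * 100,
        "dntp_mM_each": MASTER_MIX_UL["dNTPs (10 mM each)"] * share * 10 / volume,
        "taq_units": MASTER_MIX_UL["Taq DNA polymerase (5 U/ul)"] * share * 5,
        "mgcl2_mM": MGCL2_PER_TUBE_UL * 12.5 / volume,
        "template_pmol": TEMPLATE_UL * 10,
    }


if __name__ == "__main__":
    for key, value in per_reaction().items():
        print(f"{key}: {value:.3g}")
```

Under these figures the master mix corresponds to 4.5 reactions of 100 μl, so each of the four tubes receives 1X buffer with a small excess of mix left over.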

For cloning, the amplified oligo DNAs are cut with KpnI and SacI. The vector DNA also has to be cut with both enzymes. As vector DNA, either pROCOS4/7 or a derivative thereof named pROCOS4/7-Stuffer1, which contains a DNA stuffer fragment for easier control of the double digest reaction, can be used without any consequences regarding the final cloning results. Digestions are done according to the following protocols:

buffer B + TX-100 (1X): Tris-HCl (pH 7.5) 10 mM; MgCl₂ 10 mM; BSA 0.1 mg/ml; Triton X-100 0.02%

buffer A (1X): Tris-acetate (pH 7.9) 33 mM; Mg-acetate 10 mM; K-acetate 66 mM; Dithiothreitol 0.5 mM

Vector DNA Digestion

1. For the restriction digestion of the vector DNA with KpnI set up the following mixture:
pROCOS4/7-Stuffer1: X μl (200 μg)
buffer B + TX-100 (10X): 150 μl
BSA (10 mg/ml = 100X): 15 μl
KpnI: X μl (400 U)
ddH₂O: to 1500 μl

incubate at 37° C. for 3 hr and stop the reaction by incubating at 65° C. for 20 min.

2. Take an aliquot of 3 μl and run a 1% agarose gel with uncleaved DNA as a control.

3. Extract with phenol, precipitate with ethanol and resuspend the DNA in 820 μl TE-buffer.

4. Store a 20 μl aliquot of the digested DNA at −20° C. and mix the following for the digestion with SacI:
pROCOS4/7-Stuffer/KpnI: 800 μl
buffer A (10X): 100 μl
SacI: X μl (400 U)
ddH₂O: to 1000 μl

incubate at 37° C. for 3 hr.

5. Take an aliquot of 3 μl and run a 1% agarose gel using uncleaved and single-cut DNA as a control.

6. Extract with phenol, precipitate with ethanol and resuspend the DNA in 550 μl TE-buffer.

Oligo DNA Digestion

1. For the digestion of double-stranded (ds) oligo DNA with KpnI set up the following four mixtures:
NONA-CA, -CT, -GA or -GT dsDNA: 100 μl
buffer B + TX-100 (10X): 50 μl
BSA (10 mg/ml = 100X): 5 μl
KpnI: X μl (400 U)
ddH₂O: to 500 μl

NOTE: Do not heat the oligo DNA.

2. Take an aliquot of 5 μl and run a 4.5% agarose gel with uncleaved DNA as a control.

3. Extract with phenol, precipitate with ethanol and resuspend the DNA in 110 μl TE-buffer.

4. Store a 10 μl aliquot of the digested DNA at −20° C. and set up the following four mixtures for the digestion with SacI:
NONA-CA, -CT, -GA or -GT/KpnI: 100 μl
buffer A (10X): 50 μl
SacI: X μl (400 U)
ddH₂O: to 500 μl

incubate at 37° C. for 5 hr.

5. Take an aliquot of 5 μl and run a 4.5% agarose gel using uncleaved and single-cut DNA as a control.

6. Extract with phenol, precipitate with ethanol and resuspend the DNA in 55 μl TE-buffer.

The vector-DNA fragment may be purified using the following protocol:

Purification of Vector DNA Fragments by Gel Extraction

1. To separate the pROCOS4/7 vector DNA fragment from the stuffer fragment prepare a horizontal 1% agarose gel using a one-tooth comb.

2. Mix the DNA with 1/10 vol gel loading buffer, load onto the gel and electrophorese at 100 V until both fragments are clearly separated.

3. Put the gel on the UV transilluminator and excise the 5.5 kb pROCOS4/7 vector DNA fragment.

4. Extract the agarose slice using the JETsorb gel extraction kit (Genomed GmbH, Germany).

Vector and insert DNA fragments are ligated and transformed according to the following protocols:

Ligation of DNA Fragments

Check the integrity of vector and insert DNA fragments by agarose gel electrophoresis (1% and 4.5% respectively). The concentration of the insert DNA may be estimated by comparing its ethidium bromide staining with standards of known quantity, such as assembled oligonucleotides. To determine the vector DNA concentration, measure the absorbance at 260 and 280 nm.

T4 DNA ligase buffer (1X): Tris-HCl (pH 7.5) 50 mM; MgCl₂ 10 mM; Dithiothreitol 10 mM; ATP 1 mM; BSA 25 μg/ml

Test Ligation

1. To determine the appropriate ratio of insert to vector DNA a series of test ligations may be performed. For this, assemble ligation reactions composed of:
vector DNA fragment: X μl (0.5 μg)
T4 DNA ligase buffer (10×): 1 μl
ddH₂O: to 9 μl

2. Prepare three twofold dilutions of the insert DNAs in ddH₂O and add 1 μl of undiluted DNA as well as 1 μl of each dilution to one of the ligation reactions.

NOTE: The aim of this is to create vector to insert DNA (V/I) ratios of 1:5 to 2:1.

3. Add 1 unit T4 DNA ligase to each reaction and incubate overnight at15° C.

NOTE: As a control, one reaction without insert DNA and one without ligase should be included.

4. Add 1 vol ddH₂O to each reaction and incubate at 65° C. for 10 min.

5. Precipitate the DNA with ethanol and resuspend it in 10 μl TE buffer.

6. Transform electrocompetent E. coli JM110λ cells with the content of each tube and plate dilutions on ampicillin-containing LB agar plates.

Large-Scale Ligation

1. To create the libraries set up four of the following mixtures:
vector DNA fragment: X μl (¼ of the total prep.)
insert DNA: X μl (to create the optimal V/I ratio)
T4 DNA ligase buffer (10×): X μl (1/10 of the final vol)
T4 DNA ligase: X μl (2 U/μg DNA)
ddH₂O: to create a DNA conc. of 0.05 μg/μl

incubate overnight at 15° C.

2. Extract with phenol, precipitate with ethanol and resuspend each of the ligation mixtures in sufficient TE-buffer to adjust the DNA concentration to 0.1-0.2 μg/μl.
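Since the large-scale ligation is specified by ratios rather than fixed volumes, the actual volumes follow from the amount of DNA in each mixture. The following is a minimal sketch of that arithmetic; the function name `ligation_setup` and the 50 μg example input are illustrative assumptions, not values from the protocol.

```python
def ligation_setup(total_dna_ug,
                   dna_conc_ug_per_ul=0.05,   # target DNA conc. during ligation
                   ligase_u_per_ug=2.0,       # 2 U T4 DNA ligase per μg DNA
                   resuspend_conc=(0.1, 0.2)  # target conc. after precipitation
                   ):
    """Derive volumes for one large-scale ligation mixture."""
    final_volume = total_dna_ug / dna_conc_ug_per_ul
    low, high = resuspend_conc
    return {
        "final_volume_ul": final_volume,
        "buffer_10x_ul": final_volume / 10,        # 1/10 of the final volume
        "ligase_units": ligase_u_per_ug * total_dna_ug,
        # More TE gives the lower concentration, less TE the higher one.
        "te_volume_range_ul": (total_dna_ug / high, total_dna_ug / low),
    }


if __name__ == "__main__":
    # Hypothetical example: 50 μg of vector plus insert DNA in one mixture.
    print(ligation_setup(50.0))
```

For the hypothetical 50 μg input this gives a 1 ml ligation, and a resuspension volume of 250 to 500 μl TE-buffer, assuming no DNA is lost during extraction and precipitation.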

Preparation of Competent Cells

1. Inoculate 20 ml of LB medium with a single colony of E. coli JM110λ and incubate at 37° C. and 180 rpm overnight.

2. Next day inoculate 2×1 liter of LB medium (2 × 2 l Erlenmeyer flasks) at 1% with the overnight culture and incubate again under the same conditions until an optical density of OD₆₀₀=0.6 has been reached.

3. Transfer 250 ml aliquots of the culture into centrifuge tubes (GS3), chill the cells on ice and centrifuge for 15 min at 8000 rpm and 4° C. (Sorvall RC5C centrifuge; GS3 rotor). Decant the supernatant.

4. Resuspend each pellet in 250 ml of ice-cold ddH₂O, centrifuge again (step 3) and decant the supernatant.

5. Resuspend each pellet in 125 ml of ice-cold ddH₂O, collect each of two aliquots in one tube, centrifuge again (step 3) and decant the supernatant.

6. Resuspend each pellet in 10 ml of ice-cold sterile glycerol (10%), collect all of the aliquots in one GSA centrifuge tube, centrifuge for 15 min at 8000 rpm and aspirate the supernatant.

7. Resuspend the bacterial pellet in 10 ml of ice-cold sterile glycerol(10%).

8. Fill aliquots of 100 μl into precooled, sterile Eppendorf reaction tubes, freeze immediately in liquid nitrogen and store at −70° C.

Transformation of E. coli Cells by Electroporation

1. Place frozen aliquots of competent E. coli cells on ice and let them thaw.

2. To each aliquot add up to 2 μg DNA in less than 10 μl and incubate on ice for 1 minute.

3. Fill the suspension into a prechilled electroporation cuvette (0.2 cm pathlength), place the cuvette in the electroporation sled and give a pulse at a voltage of 2.5 kV, a capacitance of 25 μF and a resistance of 200 Ω (Gene Pulser and Pulse Controller, Bio-Rad).

4. Immediately add 1 ml of LB medium (supplemented with 20 mM glucose), mix and transfer the suspension into an Eppendorf reaction tube.

5. Incubate for 1 hour at 37° C. and plate on LB agar plates containing ampicillin (100 μg/ml). Incubate overnight at 37° C.

NOTE: To determine the size of the libraries also plate dilutions of the transformed cells.

6. To create library stocks resuspend the cells in LB/ampicillin medium,mix with 1 vol of sterile 87% glycerol and store at −70° C.
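The pulse settings in step 3 of the electroporation protocol imply a field strength and a nominal pulse time constant that are convenient to check against the instrument read-out. The sketch below assumes a simple exponential-decay pulse in which the time constant is the product of the parallel resistance and the capacitance, i.e. that the sample resistance is much higher than the 200 Ω resistor; the variable names are illustrative.

```python
# Pulse settings from step 3 of the electroporation protocol.
VOLTAGE_KV = 2.5
GAP_CM = 0.2            # cuvette pathlength
CAPACITANCE_F = 25e-6   # 25 μF
RESISTANCE_OHM = 200.0  # parallel resistor setting

# Field strength across the cuvette gap.
field_strength_kv_per_cm = VOLTAGE_KV / GAP_CM

# Nominal RC time constant, assuming the sample resistance is much
# larger than the parallel resistor (an idealisation).
time_constant_ms = RESISTANCE_OHM * CAPACITANCE_F * 1e3

if __name__ == "__main__":
    print(f"field strength: {field_strength_kv_per_cm} kV/cm")
    print(f"nominal time constant: {time_constant_ms:.1f} ms")
```

A measured time constant well below the nominal value would suggest that the sample is too conductive, for example because of residual salt in the DNA or cell preparation.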

1b) Recombination

For recombination within the hypervariable sequences according to the four-tube cosmix-plexing method the libraries can be preselected. For this purpose the E. coli cells containing the phagemid libraries are superinfected with M13K07 helper phages, and progeny phages presenting fusion proteins are harvested and used for the first round of panning according to standard methods, e.g.:

Preparation of M13K07-Phage Stocks

PEG/NaCl-solution (16.7%/3.3 M): 100 g PEG 8000; 116.9 g NaCl; 475 ml H₂O

PBS-buffer (1×): 8.0 g NaCl; 0.2 g KCl; 1.43 g Na₂HPO₄ · 2H₂O; 0.2 g KH₂PO₄; H₂O ad 1 l; pH 6.8-7

1. Use a disposable Pasteur pipette to pick a single, well separated M13K07 plaque from an E. coli WK6 lawn grown overnight on an LB/kanamycin (Km) plate, inoculate 20 ml of LB(2X)/Km medium (100 ml Erlenmeyer flask) with this agar slice and incubate during the day at 37° C. on a shaker at 180 rpm.

2. Inoculate 2×500 ml LB(2X)/Km medium (in 2 l Erlenmeyer flasks) with 10 ml preculture and incubate overnight (37° C., 180 rpm).

3. Next day centrifuge four 250 ml aliquots for 15 minutes at 8000 rpm and 4° C. (Sorvall RC5C centrifuge; GS3 rotor). Transfer the supernatant into centrifuge bottles, centrifuge and transfer the supernatant again into fresh centrifuge bottles.

4. Add 0.15 vol. of PEG/NaCl solution, mix and incubate on ice for at least 2 hours.

5. Centrifuge for 60 min at 8000 rpm (GS3 rotor), decant the supernatant, centrifuge for a few seconds at up to 4000 rpm and remove the last traces of the supernatant using a pipette.

6. Resuspend each PEG-pellet in 2.5 ml PBS solution and collect the resuspended phages in one SS34 centrifuge bottle. To clear the suspension centrifuge again for 10 min at 12000 rpm (SS34 rotor). Recover the supernatant (pipette), add NaN₃ to a final concentration of 0.02% and store the phages at 4° C.
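The PEG and NaCl concentrations actually reached during the precipitation in step 4 follow from adding 0.15 volumes of the 16.7%/3.3 M stock. The following sketch works this out and also checks the stock recipe for internal consistency; it assumes ideal volume additivity and a molar mass of 58.44 g/mol for NaCl, and the names used are illustrative.

```python
NACL_G_PER_MOL = 58.44

# Stock solution as specified: nominal concentrations and weighed amounts.
STOCK_PEG_PERCENT = 16.7   # % (w/v)
STOCK_NACL_M = 3.3
PEG_G = 100.0
NACL_G = 116.9


def after_addition(stock_conc, added_vol=0.15):
    """Concentration after adding `added_vol` volumes of stock to 1 volume."""
    return stock_conc * added_vol / (1.0 + added_vol)


# Final concentrations in the phage supernatant (step 4).
final_peg_percent = after_addition(STOCK_PEG_PERCENT)
final_nacl_m = after_addition(STOCK_NACL_M)

# Final stock volume implied by each weighed amount; both should agree
# if the recipe and the nominal concentrations are consistent.
implied_volume_from_peg_ml = PEG_G / STOCK_PEG_PERCENT * 100.0
implied_volume_from_nacl_ml = NACL_G / NACL_G_PER_MOL / STOCK_NACL_M * 1000.0

if __name__ == "__main__":
    print(f"PEG 8000 during precipitation: {final_peg_percent:.2f} % (w/v)")
    print(f"NaCl during precipitation: {final_nacl_m:.2f} M")
    print(f"implied stock volume: {implied_volume_from_peg_ml:.0f} ml (PEG), "
          f"{implied_volume_from_nacl_ml:.0f} ml (NaCl)")
```

Both weighed amounts point to a final stock volume of roughly 600 ml, which is consistent with 475 ml of water plus the volume taken up by the dissolved solids.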

Packaging of Phagemids (keep each library separate!)

1. Inoculate 100 ml of LB/Amp medium (1 l Erlenmeyer flask) with 1 ml of E. coli JM110λ cells containing phagemids (from overnight culture or resuspended cells) and incubate at 37° C. and 180 rpm until OD₆₀₀=0.5 (~2.5 h).

2. Add 500 μl M13K07 stock solution (10¹¹-10¹² cfu/ml), incubate at 37° C. for 15 min and continue shaking at 37° C. and 180 rpm overnight.

3. Next day centrifuge for 10 min at 8000 rpm (GSA rotor), decant the supernatant into a fresh bottle and repeat the centrifugation step.

4. Add 0.15 vol of PEG/NaCl solution and incubate on ice for at least two hours.

5. Centrifuge for 60 min at 10000 rpm (GSA rotor), decant the supernatant, repeat the centrifugation and remove the supernatant completely.

6. Dissolve the pellet in 1 ml of PBS buffer and transfer the solution into an Eppendorf reaction tube. Centrifuge for 10 min at 13000 rpm (bench centrifuge), recover the cleared solution and add NaN₃ (final concentration of 0.02%). Store at 4° C.

Panning Procedure (keep each library separate!)

T-PBS solution:

PBS-buffer containing 0.5% Tween 20

Blocking solution:

PBS-buffer containing 2% skim milk powder

Elution-buffer:

glycine (0.1 M; pH 2.2)

1. Coating of Microtiter Plates:

Fill 100 μl of ligand solution (100 μg/ml PBS) into the wells of a 96-well microtiter plate (Nunc maxisorb) and incubate overnight at 4° C. or at least 2 hours at room temperature. Shake out the wells, slap the plate onto a paper towel and wash the wells once with T-PBS solution (ELISA plate washer or manually).

2. Blocking:

Fill the wells with 400 μl of blocking solution and incubate at room temperature for ~1 hour. Shake out the wells, slap the plate onto a paper towel and wash the wells once with T-PBS.

3. Binding:

Fill the coated and one uncoated well (as a control) with 100 μl of phage preparations diluted 1:1 with skim milk powder (usually ~10¹⁰-10¹¹ phages/well) and incubate at room temperature for 1 to 3 hours.

4. Washing:

Remove the solutions using a pipette and slap the plate onto a paper towel.

In the first round of panning wash the wells once with T-PBS, incubate for 10 min with 400 μl blocking solution, wash again with T-PBS and finally two times with water. During all further rounds repeat the T-PBS washing steps three times. All washing steps can be carried out manually using a pipette or with an ELISA plate washer.

5. Elution:

Slap out the plate and fill the wells with 100 μl of elution-buffer, incubate at room temperature for 15 min and transfer the solution into an Eppendorf reaction tube containing 6 μl Tris (2 M).

6. Determine the titer of eluted phages as described under 3.1.3.

Reinfection of E. coli Cells (keep each library separate!)

1. Mix the eluted phages and 10 ml of E. coli JM110λ log-phase cells and incubate for 30 min at 37° C.

2. Collect the cells by centrifugation (5 min, 8000 rpm, SS34 rotor) and resuspend the pellet in 400 μl of LB/Amp medium.

3. Plate each suspension on one LB/Amp agar plate (Ø14.5 cm) and incubate overnight at 37° C.

After one round of panning, populations of about 10⁵ individual clones enriched towards binding clones are expected. For recombination the phagemid DNA has to be isolated according to standard protocols, e.g.:

Preparation of Phagemid-DNA from Reinfected Cells

1. Resuspend reinfected E. coli cells in 20 ml of LB/Amp medium and use 200 μl for the inoculation of 3 ml LB/Amp medium.

2. Incubate at 180 rpm at 37° C. for 1 hour.

3. Prepare the DNA using Jetquick Plasmid Miniprep Spin Kits (Genomed GmbH, Germany) according to the instructions of the supplier.

Using this method up to 30 μg of DNA can be isolated. For pROCOS4/7-based libraries the phagemid size is 4.3 kb, corresponding to a molecular weight of 2.9×10⁶ g/mol or approximately 2×10¹¹ phagemid molecules/μg DNA. Therefore 10 μg of recombined DNA contains more molecules than the theoretical number of different variants that can be created from 10⁵ clones ((10⁵)²=10¹⁰).
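The molecule count quoted above can be reproduced from the stated molecular weight. This is a minimal sketch of the arithmetic; the function name `molecules_per_ug` is illustrative, and the per-base-pair mass of about 660 g/mol used for the cross-check is a common approximation rather than a figure from the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mol


def molecules_per_ug(molecular_weight_g_per_mol):
    """Number of molecules contained in 1 μg of a given species."""
    return 1e-6 / molecular_weight_g_per_mol * AVOGADRO


# Figures from the text.
PHAGEMID_MW = 2.9e6        # g/mol for the 4.3 kb phagemid
PRESELECTED_CLONES = 1e5   # clones remaining after one round of panning

per_ug = molecules_per_ug(PHAGEMID_MW)
ten_ug_molecules = 10 * per_ug
pairwise_variants = PRESELECTED_CLONES ** 2   # (10^5)^2 = 10^10

# Cross-check of the molecular weight, assuming ~660 g/mol per base pair.
approx_mw = 4300 * 660

if __name__ == "__main__":
    print(f"molecules per μg: {per_ug:.2e}")
    print(f"molecules in 10 μg: {ten_ug_molecules:.2e}")
    print(f"theoretical recombinants: {pairwise_variants:.0e}")
    print(f"approximate MW from length: {approx_mw:.2e} g/mol")
```

With these figures 10 μg of DNA corresponds to about 2×10¹² molecules, roughly two hundred times the 10¹⁰ possible pairwise combinations.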

For recombination the phagemid DNA of each preselected library is cut separately, e.g. with BpmI or alternatively with BsgI:

Digestion of the phagemid DNA

NEB3-buffer (1×): NaCl 100 mM; Tris-HCl (pH 7.9) 50 mM; MgCl₂ 10 mM; Dithiothreitol 1 mM

1. Set up the following reaction:
phagemid-DNA: 10 μg
BpmI (2 U/μl): 5 μl
NEB3 (10×): 4 μl
BSA (1 mg/ml): 4 μl
H₂O: up to 40 μl

incubate at 37° C. for 5 hr.

2. Take an aliquot of 4 μl and run a 1% agarose gel to check the digestion.

3. Extract with phenol, precipitate with ethanol and resuspend the DNA in TE-buffer.

Digested phagemid DNAs are religated at high concentration (≥0.2 μg/μl) to favour formation of concatemers, packaged into λ phage particles and used for the transfection of E. coli cells (according to “Packaging of Bacteriophage λ DNA in vitro; protocol I”, p. 2.100-2.104, in: Molecular Cloning: A Laboratory Manual, Sambrook et al. (eds.), 2nd ed., 1989, Cold Spring Harbor Laboratory Press). Transfected phagemids are separated by packaging and reinfection using M13K07 helper phages (see above).

EXAMPLE 2

Cosmix-plexing using the one-tube method (according to claims 11-17 and 44)

2a) Library generation

Oligonucleotide sequences:

NONACOS-NGG (SEQ ID NO: 10): 5′ GGCTCTGATGGAAGACGT↓GCAG C (NNB)₄NGG (NNB)₄TGC↓TCCAG A GTCTTC CTC CTGTCG 3′ (sites, 5′ to 3′: BpiI, BsgI, BpmI, BpiI)

NONACOS-NCT (SEQ ID NO: 11): 5′ GGCTCTGATGGAAGACGTGCAGC (NNB)₄NCT (NNB)₄TGCTCCAGAGTCTTCCTCC TGTCG 3′

NONACOS-NAM (SEQ ID NO: 12): 5′ GGCTCTGATGGAAGACGTGCAGC (NNB)₄NAM (NNB)₄TGCTCCAGAGTCTTCCTC CTGTCG 3′

NONACOS-NTS (SEQ ID NO: 13): 5′ GGCTCTGATGGAAGACGTGCAGC (NNB)₄NTS (NNB)₄TGCTCAGAGTCTTCCTC CTGTCG 3′

where N means: A, C, G or T; B: C, G or T; M: A or C and S: C, G or T.

NONACOS-PCR-L (SEQ ID NO: 14): 5′ GGCTCTGATGGAAGACGTGCAG 3′ (sites, 5′ to 3′: BpiI, BsgI)

NONACOS-PCR-R (SEQ ID NO: 15): 5′ CGACAGGAGGAAGACTCTGGAG 3′ (sites, 5′ to 3′: BpiI, BpmI)

BpiI (GAAGAC), BsgI (GTGCAG) and BpmI (CTCCAG) restriction enzyme recognition sites are marked in bold type. BpiI cutting sites are marked by arrows.

Important vector DNA sequences of pROCOS5/3 (SEQ ID NO: 16):

5′ GGCGAGCTCCCGT↓GCAG CG GTCTTCAGCGCTTGCCGTCTGACCGT,
AGCGCTGGAAGACGC↓TCCAG AGGGTACCCCGATATCAGAGCTGAA 3′

(sites, 5′ to 3′: BsgI, BpiI, Eco47III in the first segment; Eco47III, BpiI, BpmI in the second segment; |pIII→ begins at the final GAA)

BpiI (GAAGAC), BsgI (GTGCAG), Eco47III (AGCGCT) and BpmI (CTCCAG) restriction enzyme recognition sites are marked in bold type. BpiI cutting sites are marked by arrows. The first codon of the mature pIII protein (GAA) is also indicated.

To create libraries according to the one-tube method the hypervariable oligos NONACOS-NGG, -NCT, -NAM and -NTS are amplified using the PCR primers NONACOS-PCR-R and NONACOS-PCR-L as described in example 1, except that the oligo DNAs do not have to be kept separate.

After this, pROCOS5/3 vector DNA and double-stranded (ds) oligo DNA are digested with BpiI and ligated at the same time according to the following protocol:

Digestion/Ligation

1. Set up the following mixture:
pROCOS5/3 DNA: 200 μg
NONACOS-NGG, -NCT, -NAM and -NTS ds DNA: 100 μl
BpiI: 200 U
buffer G (10×): 40 μl
BSA (10 mg/ml): 4 μl
H₂O: up to 400 μl

incubate at 37° C. for 2 hr, add 200 units T4 DNA ligase and continue the incubation at 15 to 30° C. overnight.

2. Take an aliquot of 3 μl and run a 1% agarose gel as a control.

This protocol favours the production of concatemers of the desired product, which can be packaged, for example in E. coli JM110λ cells, by λ-packaging according to example 1.

2b) Recombination

For panning and recombination the same methods as described for example 1 can be used, except that one library is used instead of four separate libraries.

1. A bank of genes, wherein said genes comprise a double stranded DNA sequence which is represented by the following formula of one of their strands:

5′ B₁B₂B₃ . . . B_(n)X_(n+1) . . . X_(n+a)Z_(n+a+1)X_(n+a+2)Z_(n+a+3) . . . X_(n+a+b)Q_(n+a+b+1) . . . Q_(n+a+b+j) 3′

wherein n, a, b and j are integers and n>3, a>1, b>3 and j>1, wherein X_(n+1) . . . X_(n+a+b) is a hypervariable sequence and B, X, Z and Q represent adenine (A), cytosine (C), guanine (G) or thymine (T), (i) Z represents G or T at a G:T ratio of about 1:1, and/or (ii) Z represents C or T at a C:T ratio of about 1:1, and/or (iii) Z represents A or G at a A:G ratio of about 1:1, and/or (iv) Z represents A or C at a A:C ratio of about 1:1, and wherein subsequences B₁ . . . B_(n) and/or Q_(n+a+b+1) . . . Q_(n+a+b+j) represent recognition sites for restriction enzymes, and wherein the recognition sites are orientated such that their cleavage site upon cleavage generates a cohesive end including the two bases designated Z.

2-50. (canceled)